SLIDE 1

Publishing, Delivery, and Curation: a Selection of Services from the California Digital Library for the UC Community and Beyond

CENIC 2011

Patricia Cruse, Director, UC3, CDL
Lisa Schiff, Technical Lead, Publishing Group, CDL
Adrian Turner, Data Consultant, Digital Special Collections, CDL

SLIDE 2

Roadmap for today’s talk

  • Who we are
  • Technical Infrastructure
  • Highlighted Services
    – eScholarship
    – Calisphere
    – Merritt
    – EZID
  • Questions

California Digital Library @ the University of California

SLIDE 3

California Digital Library FAQ

  • Established 1997
  • Support the UC’s pursuit of scholarship
  • Extend the University’s public service mission
  • Partner in research
    – knowledge creation
    – collecting
    – publishing
    – discovering
    – sharing, and saving data for use by future scholars

SLIDE 4

CDL’s Programs and Services

Programs:

  • Collections
  • Digital Special Collections
  • Discovery & Delivery
  • Publishing Group
  • University of California Curation Center (UC3)

Services:

  • Business Services
  • Information Services
  • Infrastructure & Applications Support Services
  • Strategic and Project Planning Services
  • User Experience Design Services
SLIDE 5

The current landscape

  • Ever increasing number, size, and diversity of content
    – More stuff, fewer resources
  • Ever increasing diversity of partners, stakeholders, and expectations
    – Producers / consumers → prosumers / conducers
  • Inevitability of disruptive change
    – Technology
    – User expectation
    – Institutional mission and resources

Problem or opportunity?

[Figure: chart with axes labeled "$" / "Work" and "Time"]

SLIDE 6

CDL Technical Infrastructure (1)

  • “Physical” Infrastructure
    – Dev -> Stage -> Production path
        • Dev/Stage servers collocated at UC Berkeley Datacenter, managed by CDL
        • Production data center at UCOP, shared management
    – Sun/Solaris-centric -> VM/Linux-centric
    – Monitoring via Groundwork
    – Version control: CVS + Distributed Systems
    – Open Source preference

SLIDE 7

CDL Technical Infrastructure (2)

  • Management Structure
    – Technical Leads
    – Infrastructure and Applications Support
    – TechCouncil
  • Shared tools and practices, as much as feasible
    – “cdlcommon”
    – Tech All Hands
    – TechTalks

SLIDE 8

CDL Infrastructure Map

SLIDE 9

eScholarship

SLIDE 10

Open Access Publishing Services

  • Digital Publishing Services
    – Journals
    – Books/UCPubS
    – Working Papers
    – Conference Proceedings
    – Seminar/Paper Series
  • Traditional “Repository” Services
    – Postprints

SLIDE 11

Some Journals in eScholarship

SLIDE 12

Some Research Units in eScholarship

SLIDE 13

eScholarship Adds Value To Publications…

  • Increasing discoverability (Google & external sites)
  • Adding credibility via clear branding at system, campus and unit levels
  • Providing a wide variety of robust search, browse and “evaluation” tools

SLIDE 14

SLIDE 15

SLIDE 16

SLIDE 17

SLIDE 18

SLIDE 19

SLIDE 20

eScholarship System Architecture

  • Submission and publishing backend (Vendor: bepress)
  • eScholarship XTF based access and display interface
  • eScholarship Harvester
  • Additional content sources:
    – Springer: FTP deposit
    – BioMed Central: SWORD deposit
    – ETDs (to come): Merritt Atom feed
  • Submission to additional access points:
    – UC’s WorldCat
    – PubMed
    – RePEc
    – Others as available
  • Merritt Preservation Repository
  • OAI-PMH/Web Interface
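
The OAI-PMH interface named above follows a simple request pattern: an HTTP GET carrying a `verb` parameter, answered with XML that may include a `resumptionToken` for paging. Below is a minimal Python sketch of how a harvester might build a `ListRecords` request and read record identifiers from a response. The base URL and the sample response are placeholders invented for illustration; they are not eScholarship's actual endpoint or data.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical base URL, for illustration only.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Made-up response, trimmed to the parts the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:item1</identifier></header></record>
    <record><header><identifier>oai:example.org:item2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_list_records(SAMPLE)
print(list_records_url(BASE_URL))
print(ids, token)
```

A harvester would repeat the request with the returned token until no token comes back.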

SLIDE 21

eScholarship Foundational Technologies

  • XTF: eXtensible Text Framework
    – Access interface
    – CDL technology
  • Digital Commons & EdiKit
    – Submission and manuscript management
    – Vendor hosted: www.bepress.com
  • Merritt micro-services
    – Preservation
    – CDL technology

More information here: http://www.escholarship.org/help_documentation.html

SLIDE 22

What is XTF?

  • A framework for building robust digital content applications
  • Java Servlets + Lucene + Saxon + XSLT 2.0 = XTF
  • Used by institutions worldwide
  • Open source: http://xtf.cdlib.org/download
  • Maintained and supported by CDL developers
SLIDE 23

Implemented Worldwide

SLIDE 24

XTF Key Features

  • Robust search
    – Collection and within-document
    – Boolean commands, truncation/wildcard operators, and exact phrases
    – Structure-aware searching (e.g., search only this chapter)
    – Highlighting of search hits in context
    – Spell-checking of search terms

  • Bookbags
  • Similar item suggestions
  • Hierarchical facets
  • Enhanced presentations of individual papers
  • Globalization – users choose interface language
  • OpenLibrary Book Reader
  • OAI-PMH interface
  • Image Zoomer (Coming late Spring)
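
Hierarchical facets, listed above, can be pictured as counts rolled up along a path such as state > county. The sketch below is a toy in plain Python, not XTF's implementation; the `::` path separator and the sample values are assumptions made for the example.

```python
from collections import Counter


def facet_counts(values, sep="::"):
    """Count each facet value and every ancestor level above it."""
    counts = Counter()
    for value in values:
        parts = value.split(sep)
        # "A::B::C" contributes to "A", "A::B", and "A::B::C".
        for depth in range(1, len(parts) + 1):
            counts[sep.join(parts[:depth])] += 1
    return counts


# Made-up facet values, one per document.
docs = [
    "California::Los Angeles",
    "California::Los Angeles",
    "California::Alameda",
    "Nevada::Washoe",
]

counts = facet_counts(docs)
for path in sorted(counts):
    indent = "  " * path.count("::")
    print(f"{indent}{path.split('::')[-1]} ({counts[path]})")
```

Selecting a parent facet then narrows the result set to every document counted under it, which is what makes drill-down browsing possible.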
SLIDE 25

XTF System Architecture

SLIDE 26

http://xtf.cdlib.org

SLIDE 27

Calisphere

SLIDE 28

Calisphere

  • Over 200 institutions
  • Open to California libraries, museums, archives, and historical societies
  • Collection strengths in state and Western history
  • Access to 210,000 digitized primary sources
  • Target audiences: College and K-12 educators, students
SLIDE 29

Calisphere Technologies

  • XTF: eXtensible Text Framework
    – Access interface
    – CDL technology
  • Submission and hosting format:
    – METS (Metadata Encoding and Transmission Standard)

More information here: http://www.cdlib.org/services/dsc/technical.html
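
As an illustration of the METS format mentioned above, the sketch below reads the file list out of a small METS document with Python's standard library. The sample document, its label, and its file URLs are invented for the example and trimmed to the file section only; real METS objects typically carry further sections such as descriptive metadata and a structural map.

```python
import xml.etree.ElementTree as ET

# Namespaces used by METS and by its xlink:href attributes.
NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Made-up, trimmed METS document for illustration.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                      xmlns:xlink="http://www.w3.org/1999/xlink"
                      LABEL="Orange grove, ca. 1900">
  <mets:fileSec>
    <mets:fileGrp USE="image/master">
      <mets:file ID="FID1" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/img/0001.tif"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="image/thumbnail">
      <mets:file ID="FID2" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/img/0001_t.jpg"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def list_files(mets_xml):
    """Return (file group USE, MIME type, URL) for every file in the fileSec."""
    root = ET.fromstring(mets_xml)
    files = []
    for grp in root.findall("mets:fileSec/mets:fileGrp", NS):
        for f in grp.findall("mets:file", NS):
            loc = f.find("mets:FLocat", NS)
            href = loc.get("{http://www.w3.org/1999/xlink}href")
            files.append((grp.get("USE"), f.get("MIMETYPE"), href))
    return files


for use, mime, href in list_files(SAMPLE_METS):
    print(use, mime, href)
```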

SLIDE 30

SLIDE 31

SLIDE 32

SLIDE 33

SLIDE 34

SLIDE 35

SLIDE 36

System Architecture

  • Institution submits content
  • Calisphere repository
  • Merritt repository
  • XTF based access and display interface

METS

  • OAI-PMH
  • API (forthcoming)
SLIDE 37

SLIDE 38

by era, by people, by topical terms

SLIDE 39

SLIDE 40

SLIDE 41

SLIDE 42

SLIDE 43

SLIDE 44

by location

SLIDE 45

SLIDE 46

SLIDE 47

SLIDE 48

SLIDE 49

SLIDE 50

Calisphere Search Widget (forthcoming)

SLIDE 51

Historical Citrus Photos

Search here

SLIDE 52

Historical Citrus Photos

Search here

SLIDE 53

SLIDE 54

University of California Curation Center

Creative partnership between the CDL, the 10 UC campuses, and other peer institutions

  – A community of shared concern and practice
  – A channel to pool and distribute diverse experience, expertise, and resources
  – Robust, innovative, and cost-effective solutions to counteract inevitable disruptive change

Ken Spraque, The Parable of the Fishes

SLIDE 55

Content snapshot

DPR plus

[Figure: Growth of managed content*, in TB (axis 0 to 90); series: Merritt, DPR, UCTV, VAT]

*excludes content waiting in the wings and web content

  • Institutions: 93
  • Collections: 131
  • Objects: 54,697
  • Files: 1,619,190
  • Total size (TB): 78.96

Web Archiving Service

  • Institutions: 18
  • Users: 105
  • Archives: 108
  • Sites captured: 3,885
  • Captures run: 26,463
  • Total size (TB): 17,839

SLIDE 56

Three questions or imperatives?

  • How can we best respond organizationally?
  • How does our technical landscape change?
  • How can we build new communities?
SLIDE 57

Diversity of stakeholders…

UC Curation Center

  • Faculty / researchers
  • Organized research units
  • Libraries
  • Museums
  • IT / data centers
  • National / international libraries
  • Private sector
  • Non-profit
  • Academic institutions

UC community and External Partners

SLIDE 58

A dynamic technical approach

  • Simple and flexible technical infrastructure
  • Use third-party components
  • Outsource when necessary
SLIDE 59

Technical imperatives

  • Provide innovative, effective, and efficient services
  • Plan for change
    – Focus on content, not the systems in which that content is managed
        • Systems come and go (but not our system ;-)
    – Occam’s Razor and Murphy’s Law suggest
        • Favor the small and simple over the large and complex
        • Favor the proven over the (merely) novel
  • Enable curation at the point of use
  • Do more with less

SLIDE 60

Curation micro-services

Devolve curation function into a granular set of independent, but interoperable micro-services

  – Each service small and self-contained; collectively easier to develop, maintain, and deploy
  – The level of investment is small; easier to replace
  – The scope of each service is limited; complex behavior can emerge from the strategic composition of individual atomistic services
  – All service interactions through public interfaces
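
One way to picture small, independent services that interact only through a public interface is a pipeline of objects sharing a single method. The following is an illustrative Python sketch, not Merritt's code: the `MicroService` protocol, the three service classes, the identifier scheme, and the dictionary-shaped digital object are all assumptions made for the example.

```python
from typing import Protocol, Sequence


class MicroService(Protocol):
    """The only public interface: take an object record, return a new one."""

    def process(self, obj: dict) -> dict: ...


class Identity:
    """Assigns an identifier (placeholder scheme, not a real ARK or DOI)."""

    def __init__(self):
        self._next = 1

    def process(self, obj: dict) -> dict:
        ident = f"id-{self._next:04d}"
        self._next += 1
        return {**obj, "id": ident}


class Characterization:
    """Extracts a content property; here only the size in bytes."""

    def process(self, obj: dict) -> dict:
        return {**obj, "size": len(obj["content"])}


class Inventory:
    """Keeps a catalog of what has been curated."""

    def __init__(self):
        self.catalog = {}

    def process(self, obj: dict) -> dict:
        self.catalog[obj["id"]] = obj["size"]
        return obj


def ingest(obj: dict, services: Sequence[MicroService]) -> dict:
    """Compose services; each one sees only the record, never another service."""
    for service in services:
        obj = service.process(obj)
    return obj


inventory = Inventory()
result = ingest({"content": b"hello"}, [Identity(), Characterization(), inventory])
print(result["id"], result["size"], inventory.catalog)
```

Because every service exposes the same `process` method, any one of them can be replaced or reordered without touching the others, which is the property the slide argues for.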

SLIDE 61

Curation micro-services

Value · Service · Curation · Preservation · Context · State

  • Annotation: of content by consumers
  • Notification: of new content availability
  • Access*: for retrieval
  • Transformation: to create derivatives
  • Search: of content and metadata
  • Index: to enable fast search
  • Ingest: of content for curation
  • Characterization*: to extract content properties
  • Inventory: of curated content
  • Replication*: for safety
  • Fixity*: to verify bit-level integrity
  • Storage: for long-term retention
  • Identity: for long-term reference

*in beta
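
Fixity, described above as verifying bit-level integrity, is commonly done by recording a cryptographic digest at ingest and recomputing it later. A minimal sketch using Python's `hashlib` follows; the choice of SHA-256 and the function names are assumptions for illustration, not a description of how Merritt's Fixity service works.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Compute a hex digest, reading in chunks so large files fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, expected_digest, algorithm="sha256"):
    """True if the file still matches the digest recorded earlier."""
    return file_digest(path, algorithm) == expected_digest


# Demonstration on a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"archival content")
    path = tmp.name

recorded = file_digest(path)          # digest stored at ingest time
intact = verify(path, recorded)       # later check on an unchanged file

with open(path, "ab") as f:
    f.write(b"!")                     # simulate a change to the stored bytes
corruption_detected = not verify(path, recorded)

os.remove(path)
print(intact, corruption_detected)    # prints: True True
```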

SLIDE 62

…a diversity of solutions

1. Consultation services
    – Expertise
    – Guidelines
    – Best practices
2. Hosted solutions
3. Campus solutions
    – Micro-services deployment
4. Partnerships
5. Community initiatives

Chronopolis, Media Vault Program, Web Archiving Service

SLIDE 63

Merritt: a next generation service

A cost-effective service that lets the UC community deposit, manage, archive, and share its valuable digital content

SLIDE 64

Merritt in action

Content deposit screen

  • Deposit
    – format ready
    – single, batch
    – versions
  • Search
  • Display
    – object info
    – object parts
  • Download object
  • Share
  • Access
  • Store / Preserve

SLIDE 65

Diversity of content…

  • CDL eScholarship: Open access publishing
  • Open Context: Archaeological
  • Minnesota Historical Society: Legislative history
  • Media Hub Program: Museum collections
  • California Digital Newspaper Collection: News media
  • Water Resource Center Archive: Environmental
  • UCTV: Multi-media
  • DataONE member node: Scientific
  • UC3 Web Archiving Service: Everything
  • UC3 legacy DPR collections: Anything

… and lots more!

SLIDE 66

Using Merritt

Dark archive for important digital assets

– UCTV

Bright archive with direct discovery and access

– Part of grant-funded research data sustainability plan

Preservation back-end for existing or new discovery and content management systems

– eScholarship, Media Hub, Open Context

Integration with distributed data grids

– Chronopolis, DataONE member node

Local deployments for special-purpose campus repositories

SLIDE 67

The research data problem

  • Journal article
    – Uniquely and persistently identified
    – Concept of “publish”
    – Multiple copies
    – Easily findable
    – Additional services: impact metrics, citation tracking, etc.
  • Research data
    – Uniquely and persistently identified? Nope
    – Concept of “publish”? Not really
    – Multiple copies? Typically one
    – Easily findable? Difficult
    – Additional services? Nope

→ Second-class citizens in scholarly record

SLIDE 68

an article about data, but no data

SLIDE 69

FTP site

The hunt for the data…

SLIDE 70

EZID: long-term identifiers made easy

Take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation

SLIDE 71

The EZID Service: a key tool for research

Say “ee-zee-eye-dee” please

Core Functions

  • Create persistent identifiers
  • Manage identifiers & associated metadata over time

Users

  • Individual researchers
  • Research groups
  • Data centers, archives, repositories
  • Distributed projects, consortia

EZID in the field

  • Assisting data intensive research
  • Helping a research team
  • Facilitating data publication
  • Managing the output of a grant
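
The two core functions above, creating an identifier and then managing its metadata and target over time, can be sketched with a small in-memory registry. This is a conceptual Python illustration only: the class, its methods, the identifier prefix, and the example URLs are assumptions made for the example and do not represent the EZID API.

```python
import uuid


class IdentifierRegistry:
    """Toy registry: the identifier stays fixed while its target may change."""

    def __init__(self, prefix="ark:/99999/fk4"):
        # Hypothetical prefix chosen for the example.
        self.prefix = prefix
        self._records = {}

    def create(self, target, **metadata):
        """Mint a new identifier pointing at `target`."""
        identifier = self.prefix + uuid.uuid4().hex[:8]
        self._records[identifier] = {"target": target, "metadata": dict(metadata)}
        return identifier

    def update(self, identifier, target=None, **metadata):
        """Change the target and/or metadata without changing the identifier."""
        record = self._records[identifier]
        if target is not None:
            record["target"] = target
        record["metadata"].update(metadata)

    def resolve(self, identifier):
        """Return the current location for the identifier."""
        return self._records[identifier]["target"]

    def metadata(self, identifier):
        return dict(self._records[identifier]["metadata"])


registry = IdentifierRegistry()
ident = registry.create(
    "http://old-server.example.org/data/run42.csv",
    creator="A. Researcher",
    title="Run 42 measurements",
)

# The data moves; citations that use the identifier keep working.
registry.update(ident, target="http://archive.example.org/datasets/run42")
print(ident, "->", registry.resolve(ident))
```

The point of the sketch is the separation of concerns: a citation records only the identifier, and whoever manages the data keeps the target current.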

SLIDE 72