Multilingual WorldCat presented by Janifer Gatenby


IFLA, Singapore, 2013-08-19 Multilingual WorldCat presented by Janifer Gatenby Karen Smith Yoshimura Eric Childress Robert Bremer Janifer Gatenby JD Shipengrover Jean Godby Gail Thornburg Richard Greene Jay Weitz Jenny Toves Diane Vizine


SLIDE 1

The world’s libraries. Connected.

Multilingual WorldCat

presented by Janifer Gatenby

IFLA, Singapore, 2013-08-19

Karen Smith-Yoshimura, Eric Childress, Janifer Gatenby, Jean Godby, Richard Greene, Jenny Toves, Diane Vizine-Goetz, Robert Bremer, JD Shipengrover, Gail Thornburg, Jay Weitz

SLIDE 2

WorldCat Today

  • Resources in nearly all languages
  • Contributed by more than 20,000 libraries worldwide
  • More than half the database is for works not in English

SLIDE 3

WorldCat Today

  • Bibliographic records
  • Hybrid records
  • Parallel records
  • Clustered at Work level (FRBR)

SLIDE 4

Existing Architecture

[Diagram: Bibliographic record with Authors, Subj Classif and Holdings; Work cluster, Content cluster, Manifestation cluster]

SLIDE 5

Complementary Initiatives

  • Work Level Record
  • GLIMIR: Manifestation & Content Clusters
  • Multi-lingual Bibliographic Structure

SLIDE 6

Work Level Record

http://www.oclc.org/research/activities/workrecs.html

SLIDE 7

Work Level Record: Objective

Create a landing page summarizing content for a work

SLIDE 8

GLIMIR

  • The Content Cluster
      • Enables better work record displays by reducing the number of lines that display for large works
      • Enables a choice of format and presents the formats that could be acceptable substitutes
      • Consolidates holdings for identical content
  • The Manifestation Cluster is important
      • Consolidates holdings at manifestation level
      • In the short term allows the record catalogued in the language of the interface to be chosen for display
      • Reduces apparent duplication
      • Allows a more accurate count of the number of manifestations in WorldCat (as opposed to the number of records)
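
The manifestation-cluster behaviour listed on this slide can be illustrated with a minimal Python sketch. It is not OCLC's implementation: the `BibRecord` class, its field names, the library symbols and the idea that a `manifestation_id` has already been assigned upstream are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BibRecord:
    """Hypothetical, simplified bibliographic record (not an OCLC schema)."""
    record_id: str
    manifestation_id: str      # assumed: cluster identifier assigned upstream
    cataloguing_language: str  # language of cataloguing, e.g. MARC 21 040 $b
    holdings: set = field(default_factory=set)  # symbols of holding libraries


def cluster_by_manifestation(records):
    """Group records that describe the same manifestation."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec.manifestation_id].append(rec)
    return dict(clusters)


def consolidated_holdings(cluster):
    """Union of the holdings attached to every record in one cluster."""
    holdings = set()
    for rec in cluster:
        holdings |= rec.holdings
    return holdings


def record_for_interface(cluster, interface_language):
    """Prefer a record catalogued in the interface language, else the first."""
    for rec in cluster:
        if rec.cataloguing_language == interface_language:
            return rec
    return cluster[0]


# Invented sample data: two records for one manifestation, one for another.
records = [
    BibRecord("r1", "m1", "eng", {"LIB-A", "LIB-B"}),
    BibRecord("r2", "m1", "ger", {"LIB-B", "LIB-C"}),
    BibRecord("r3", "m2", "eng", {"LIB-D"}),
]
clusters = cluster_by_manifestation(records)
print(len(records), "records,", len(clusters), "manifestations")
print(sorted(consolidated_holdings(clusters["m1"])))
print(record_for_interface(clusters["m1"], "ger").record_id)
```

Counting clusters rather than records gives the manifestation count, and the union of holdings is what a consolidated display would show for each cluster.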

SLIDE 9

Multilingual Bibliographic Structure Project

Creates true multi-lingual displays

  • At work and manifestation levels
  • Using all available data instead of “most appropriate record”
  • Generates data

Corrects many of the 28 million records coded “und”
Better control and linking of translations
Input to refinement of work clusters
Smarter data storage

SLIDE 10

“Most appropriate” questioned

  • WorldCat.org selects the most appropriate record to show to a user as representative of the work in the short result list and beyond
  • The end result will not be very satisfactory from a multi-lingual viewpoint… here’s why

SLIDE 11

Which record is better to present to a German speaker?

SLIDE 12

Incomplete Swedish Record

SLIDE 13

Hybrid record

SLIDE 14

Most appropriate display

SLIDE 15

Multilingual Bibliographic Structure Project

  • Work level data, mined from all associated bibliographic records, will be displayed, supplemented with expression / manifestation level data as the user drills through from the short to the fuller versions of the metadata.

The end user interface will show works and manifestations, not bibliographic records; the cataloguing client will also show bibliographic records.
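
As a rough illustration of mining work level data from all associated records, the Python sketch below pools subject headings and summaries from every record in a work cluster and files them by language. The record layout (plain dictionaries), the element names and the shortcut of treating the language of cataloguing as the language of each element are assumptions for this example, not a description of the project's actual data model.

```python
from collections import defaultdict

# Hypothetical element names; a real record would carry MARC fields instead.
ELEMENTS = ("subjects", "summaries")


def mine_work_level(records):
    """Pool elements from all records of one work, keyed by language.

    Assumption: each element is written in the record's language of
    cataloguing, which will not always hold for real data.
    """
    work = defaultdict(lambda: defaultdict(set))
    for rec in records:
        lang = rec["cataloguing_language"]
        for element in ELEMENTS:
            values = rec.get(element, [])
            if values:
                work[element][lang].update(values)
    # Convert to plain dicts with sorted lists for stable display.
    return {
        element: {lang: sorted(vals) for lang, vals in by_lang.items()}
        for element, by_lang in work.items()
    }


# Invented sample records belonging to a single work cluster.
work_records = [
    {"cataloguing_language": "eng", "subjects": ["Sisters", "Fiction"]},
    {"cataloguing_language": "ger", "subjects": ["Schwestern"],
     "summaries": ["Zusammenfassung auf Deutsch"]},
    {"cataloguing_language": "eng", "subjects": ["Fiction"]},
]
work_view = mine_work_level(work_records)
print(work_view["subjects"])
print(work_view["summaries"])
```

Duplicate headings collapse because values are pooled in sets, so the work level view draws on every record rather than on a single "most appropriate" one.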

SLIDE 16

Proposed new architecture

[Diagram: Work; language-tagged elements (eng, fre, ger, jpn) for Authors, Subj Classif, Notes, Contents ++; Translations (Language of work); Manif eng and Manif fre, each with Holdings]

SLIDE 17

Important principles

  • Language tagging of elements, particularly
      • Summaries (MARC 21 field 520)
      • Subject headings
  • Display in script preferred by the user if data is available
  • Improve translated interfaces
  • Show consolidated holdings as appropriate
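
A minimal Python sketch of how language-tagged elements might drive display: given a set of values keyed by a language (or language-script) tag, return the first one matching the user's preferences, and fall back to whatever data is available. The function name, the tag format and the sample values are hypothetical.

```python
def pick_by_language(tagged_values, preferred_tags):
    """Return (tag, text) for the first preferred tag that has data.

    tagged_values maps a language or language-script tag to text.
    If no preferred tag is present, the first available value is
    returned; None is returned only when there is no data at all.
    """
    for tag in preferred_tags:
        if tag in tagged_values:
            return tag, tagged_values[tag]
    return next(iter(tagged_values.items()), None)


# Invented summaries for one work, tagged by language (and script).
summaries = {
    "eng": "Summary in English",
    "fre": "Résumé en français",
    "jpn-Latn": "Nihongo no yōyaku",
}
print(pick_by_language(summaries, ["ger", "fre"]))
print(pick_by_language(summaries, ["jpn-Jpan", "jpn-Latn"]))
```

Listing a preferred script ahead of a romanized form, as in the second call, is one way to express "display in script preferred by the user if data is available".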

SLIDE 18

SLIDE 19

SLIDE 20

SLIDE 21

SLIDE 22

Translations

SLIDE 23

Great works are translated

  • The cream of the world’s cultural and knowledge heritage is shared by being translated
  • WorldCat contains many rich cataloguing records for these translations

GOAL: Data mine the really good records to improve clustering, presentation, authority records and linked data

SLIDE 24

Translations

  • Inconsistencies cause work clusters to be incomplete and search results to be less than optimal
      • Titles without subtitles
      • Different forms of uniform title or missing uniform title
      • Inverted title
      • Different coding of original and translated information

Generated uniform title authority records will overcome most of these differences without needing to edit individual records
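
To make the idea concrete, here is a small Python sketch in which variant title strings are reduced to a common matching key and then linked to one generated record, leaving the source records untouched. The normalization rules, the article list and the `ut-N` identifiers are assumptions for illustration; the slides do not describe the actual data mining, and this sketch does not address differences in how original and translated information is coded.

```python
import re
import unicodedata

# Small, illustrative list of leading articles (far from complete).
ARTICLES = ("the", "a", "an", "le", "la", "les", "der", "die", "das")


def title_key(title):
    """Reduce a title to a matching key that ignores common variations."""
    # Drop a subtitle: keep only the part before the first ":" or ";".
    main = re.split(r"\s*[:;]\s*", title, maxsplit=1)[0]
    # Undo an inverted title such as "Blind assassin, The".
    inverted = re.match(r"^(.*),\s*(\w+)$", main)
    if inverted and inverted.group(2).lower() in ARTICLES:
        main = f"{inverted.group(2)} {inverted.group(1)}"
    # Strip diacritics, then punctuation and case.
    decomposed = unicodedata.normalize("NFKD", main)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    words = re.findall(r"\w+", stripped.casefold())
    # Ignore a leading article.
    if words and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)


authority_ids = {}  # matching key -> hypothetical uniform title record id


def assign_authority(title):
    """Link a title to a generated record, creating one if needed."""
    key = title_key(title)
    return authority_ids.setdefault(key, f"ut-{len(authority_ids) + 1}")


variants = [
    "The blind assassin : a novel",
    "Blind assassin, The",
    "Blind assassin",
    "Pêcheur d’Islande",
]
for variant in variants:
    print(variant, "->", assign_authority(variant))
```

The first three variants share one key and therefore one generated record, while each bibliographic record keeps its own title string as catalogued.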

SLIDE 25

Generate uniform title authority records

  • Improve FRBR work groups
  • Made by data mining
  • Contribute to VIAF
  • Diffuse via VIAF as linked data
  • Possibility to create web page / web service

SLIDE 26

SLIDE 27

SLIDE 28

SLIDE 29

Translation records in VIAF

  • Will enrich VIAF significantly
  • New elements - translated title and translator

Author    Title                  Expressions in VIAF   Translation count in WorldCat
Atwood    Blind assassin         8                     31
Guevara   Notas de viaje                               11
Hawking   Grand design                                 18
Lenard    Grosse naturforscher   1                     3
Loti      Pêcheur d’Islande      1                     31

SLIDE 30

Diffusion of Translation records

  • Records are freely available to the world from VIAF in
      • MARC-21
      • XML
      • RDF (linked data)
      • Just links in JSON
      • And other formats as introduced

SLIDE 31

We don’t know now, but soon will

  • # of manifestations as opposed to # of records
  • # of works that have translations
  • Top translated authors and works
  • And more