In and out: Work flows between library data and linked data at the National Library of Spain




SLIDE 1

In and out: Work flows between library data and linked data at the National Library of Spain

Semantic Web in Libraries, November 25-27, 2019, Hamburg

Ricardo Santos

National Library of Spain

BIBLIOTECA NACIONAL DE ESPAÑA

@3186Ricardo #datosbne

SLIDE 2

WHAT IS DATOS.BNE?

A fully functioning linked-data based gateway into the library resources.

  • Library resources (as any other catalogue)
  • Data about authors
  • Data about vocabularies
  • Data about items
  • Connected to usual library services (digital library items, loan services, document reproduction services)

SLIDE 3

MAIN FEATURES

FRBR-based

Entity-based:

  • Entity-driven search
  • Entity-based ranking of search results

Linked and enriched:

  • Interlinks between FRBR entities
  • Linked to prominent equivalent concepts
  • Enriched with outside data

SLIDE 4

BASIC DATA MODEL

SLIDE 5

From MARC to FRBR entities

From a 2-D structure to….

SLIDE 6

From MARC to FRBR entities

SLIDE 7

Behind the scenes

MARiMbA (data generation):

  • Data (MARC)
  • Data analysis
  • Mapping & sorting
  • Linked Data generation
  • Linking and enrichment

Publication pipeline:

  • Indexing (1), with ranking based on FRBR relationships
  • Double storing: front-end (2) and triple store (3)

(1) Elasticsearch (2) MongoDB (3) Virtuoso

SLIDE 8

Peeking inside Marimba

  • Ingestion and analysis of MARC21 records
  • Mapping 1: sorting
  • Mapping 2: model entity linking
  • Mapping 3: data properties
  • RDF generation

Sorting is based on records. Example: an authority record with a given tag and a given subfield combination is an entity.

XX 100 $a $d → C1005

Object properties are based on internal keys or inferred.

Data properties are based on subfields:

260 $a → P3003 (place of publication)
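
The sorting and data-property mappings described on this slide can be pictured as table lookups. The sketch below is only an illustration and not MARiMbA's actual code: the record layout (a dict of tag to subfields), the function names and the sample values are assumptions. Only the codes C1005 and P3003 and the two rules come from the slide.

```python
# Hypothetical mapping tables; only these two rules appear in the slide.
CLASS_RULES = {
    # (tag, subfields that must be present) -> entity class
    ("100", frozenset({"a", "d"})): "C1005",
}
PROPERTY_RULES = {
    # (tag, subfield) -> data property
    ("260", "a"): "P3003",  # place of publication
}


def classify(record):
    """Mapping 1 (sorting): choose an entity class from tag + subfield combination."""
    for (tag, required), entity_class in CLASS_RULES.items():
        subfields = record.get(tag)
        if subfields is not None and required.issubset(subfields):
            return entity_class
    return None  # no rule matched


def data_properties(record):
    """Mapping 3: turn mapped subfields into (property, value) pairs."""
    pairs = []
    for (tag, code), prop in PROPERTY_RULES.items():
        value = record.get(tag, {}).get(code)
        if value is not None:
            pairs.append((prop, value))
    return pairs


# Sample records, simplified to {tag: {subfield: value}} for this sketch.
authority = {"100": {"a": "Cervantes Saavedra, Miguel de", "d": "1547-1616"}}
bibliographic = {"260": {"a": "Madrid", "b": "Cátedra"}}

print(classify(authority))             # C1005
print(data_properties(bibliographic))  # [('P3003', 'Madrid')]
```

Keeping the rules in plain tables means a new tag or subfield mapping is a data change rather than a code change, which fits the "mapping" vocabulary used on the slide.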

SLIDE 9

Dataflows

  • MARC21 (95%)
  • Digital Library (2%)
  • Other sources

DATOS.BNE.ES

SLIDE 10

Dataflows

Other sources

  • Identifiers from VIAF
  • Links: sameAs
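
As a rough illustration of what a sameAs link built from a VIAF identifier might look like, the sketch below emits one N-Triples line. The datos.bne.es resource URI pattern and both identifiers are assumptions (the IDs are placeholders, not real records); `owl:sameAs` and the `http://viaf.org/viaf/` prefix are the standard ones.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(bne_id, viaf_id):
    """Return one N-Triples line linking a BNE resource to its VIAF cluster."""
    subject = f"http://datos.bne.es/resource/{bne_id}"  # assumed URI pattern
    target = f"http://viaf.org/viaf/{viaf_id}"
    return f"<{subject}> <{OWL_SAME_AS}> <{target}> ."


# Placeholder identifiers, used only to show the shape of the output.
print(same_as_triple("XX0000000", "000000000"))
```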

SLIDE 11

Dataflows

Other sources

  • Identifiers from VIAF
  • DBpedia

THEY REMAIN ONLY IN DATOS.BNE

SLIDE 12

Enrichments and data imports

Main enrichment: EVERYDAY JOB

Other enrichments and imports:

  • Flow: Source → Library system → Datos.bne
  • Sources: libraries and beyond
  • Data imported: identifiers, equivalencies and attributes

SLIDE 13

2019: Wikidata

Achievement: more than 80,000 person authority records enriched with data from Wikidata.

  • Mapping BNE IDs to Wikidata IDs, using the VIAF identifier data dump
  • Extracting selected properties (in Spanish where available), using the Wikidata API:
      • Retrieve the selected properties for every ID in JSON
      • Sort and group them
      • Convert them to MARC21 fields according to the mapping
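
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code written by the technological partner: the function names, the placeholder IDs ("XX0000000", "V1", "Q0"), the label table and the choice of MARC 21 authority field 374 $a for occupation are all hypothetical. The sample data only imitates the shape of a Wikidata `wbgetentities` JSON response so that the sketch runs without network access.

```python
from collections import defaultdict

# Assumed mapping from Wikidata property to MARC 21 authority field and subfield.
PROPERTY_TO_MARC = {"P106": ("374", "a")}  # occupation

# Sample Spanish labels; a real run would retrieve these as well.
LABELS_ES = {"Q36180": "escritor", "Q6625963": "novelista"}


def item_statement(item_id):
    """Build one statement shaped like those in a wbgetentities response."""
    value = {"entity-type": "item", "id": item_id}
    return {"mainsnak": {"snaktype": "value",
                         "datavalue": {"type": "wikibase-entityid", "value": value}}}


# Trimmed stand-in for the API's JSON; "Q0" is a placeholder entity ID.
SAMPLE_RESPONSE = {"entities": {"Q0": {"claims": {
    "P106": [item_statement("Q36180"), item_statement("Q6625963")],
}}}}


def link_ids(bne_to_viaf, viaf_to_wikidata):
    """Join BNE and Wikidata IDs through their shared VIAF identifier."""
    return {bne: viaf_to_wikidata[viaf]
            for bne, viaf in bne_to_viaf.items()
            if viaf in viaf_to_wikidata}


def extract_items(response, qid, properties):
    """Group the item IDs found for the selected properties of one entity."""
    claims = response["entities"][qid].get("claims", {})
    grouped = defaultdict(list)
    for prop in properties:
        for statement in claims.get(prop, []):
            snak = statement["mainsnak"]
            if snak.get("snaktype") == "value":  # skip unknown/no-value snaks
                grouped[prop].append(snak["datavalue"]["value"]["id"])
    return dict(grouped)


def to_marc(grouped, labels):
    """Convert grouped values to simple 'TAG $code value' strings."""
    fields = []
    for prop, item_ids in sorted(grouped.items()):
        tag, code = PROPERTY_TO_MARC[prop]
        for item_id in item_ids:
            fields.append(f"{tag} ${code} {labels.get(item_id, item_id)}")
    return fields


links = link_ids({"XX0000000": "V1"}, {"V1": "Q0"})
grouped = extract_items(SAMPLE_RESPONSE, links["XX0000000"], ["P106"])
print(to_marc(grouped, LABELS_ES))  # ['374 $a escritor', '374 $a novelista']
```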

Performed by the datos.bne technology partner, Memorandum Multimedia.

SLIDE 14

2019: Wikidata

Quality check-ups:

  • General overview → disregard property P737 (influenced by)
  • Work on property Occupation (P106):
      • Sorting, grouping and arranging for revision by vocabulary experts
      • Decision on every value:
          • Keep it
          • Align it with BNE Subject Vocabulary
          • Disregard it
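
The three possible decisions on each occupation value can be pictured as a lookup table filled in by the vocabulary experts. The sketch below is hypothetical: the decision table, the sample values and the aligned heading "Novelistas" are invented for illustration, and values without a decision are simply set aside for review.

```python
# Hypothetical decision table: value -> (action, aligned heading or None).
DECISIONS = {
    "escritor": ("keep", None),
    "novelista": ("align", "Novelistas"),  # invented BNE Subject Vocabulary heading
    "youtuber": ("disregard", None),
}


def apply_decisions(values, decisions):
    """Return (accepted values, values still waiting for an expert decision)."""
    accepted, pending = [], []
    for value in values:
        if value not in decisions:
            pending.append(value)  # not reviewed yet
            continue
        action, target = decisions[value]
        if action == "keep":
            accepted.append(value)
        elif action == "align":
            accepted.append(target)  # replace with the vocabulary heading
        elif action != "disregard":
            raise ValueError(f"unknown action: {action}")
    return accepted, pending


accepted, pending = apply_decisions(
    ["escritor", "novelista", "youtuber", "poeta"], DECISIONS
)
print(accepted)  # ['escritor', 'Novelistas']
print(pending)   # ['poeta']
```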

SLIDE 15

2019: Wikidata

Occupation quality check-up:

SLIDE 16

From Wikidata to datosbne

A funny thing happened on my way to datos.bne

SLIDE 17

From Wikidata to datosbne

A funny thing happened on my way to datos.bne

SLIDE 18

From Wikidata to datosbne

A funny thing happened on my way to datos.bne

SLIDE 19

From Wikidata to datosbne

A funny thing happened on my way to datos.bne

SLIDE 20

From Wikidata to datosbne

A funny thing happened on my way to datos.bne

SLIDE 21

From Wikidata to datosbne

A funny thing happened on my way to datos.bne

datos.bne Persons’ advanced search

SLIDE 22

From Wikidata to datosbne

A funny thing happened on my way to datos.bne

SLIDE 23

Wikidata: concerns

PROVENANCE

  • Provenance is explicit on the library side of the data
  • No mapping or device has been implemented on the RDF side

SLIDE 24

Wikidata: concerns

QUALITY

  • Wikidata is exposed to data vandalism
  • Keep calm and don't be afraid of errors

SLIDE 25

Ricardo Santos
Head of Technical Services Department
ricardo.santos@bne.es
@3186ricardo
Pº de Recoletos 20-22
28071 Madrid, España
T +34 915 807 800
www.bne.es

BIBLIOTECA NACIONAL DE ESPAÑA NATIONAL LIBRARY OF SPAIN