Visualising Annotations - NLW Transcription projects, Glen Robson


slide-1
SLIDE 1

Visualising Annotations - NLW Transcription projects

Glen Robson - IIIF Technical Coordinator

Image from: http://map.coflein.gov.uk/index.php?action=do_images&cache_name=&numlink=23303#tab

slide-2
SLIDE 2

PROJECTS

  • Aberystwyth Student records
  • From 1870 to 1910
  • 8 Volumes
  • Partnership between NLW and Aberystwyth University
  • Transcription being done by alumni in Cardiff, Aber and other places
slide-3
SLIDE 3

PROCESSING STAGES

  • 1. Map annotation body to Linked Data fields
  • 2. Data cleanup
  • 3. Reconcile
  • 4. Load to SPARQL DB
  • 5. Repeat from step 2
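
Stages 1 and 2 above can be sketched roughly as follows. This is only an illustration, not the project's actual code: it assumes each annotation carries its form field name in `resource.label` and the transcribed text in `resource.chars`, and the predicate URIs, the subject URI and the sample values are all made up for the example.

```python
# Hypothetical mapping from form field labels to Linked Data predicates.
FIELD_TO_PREDICATE = {
    "Name": "http://schema.org/name",
    "Admission Date": "http://schema.org/startDate",
    "School": "http://schema.org/alumniOf",
}


def annotation_to_triple(annotation, subject):
    """Return one N-Triples line for an annotation, or None if the field is unmapped."""
    body = annotation["resource"]
    predicate = FIELD_TO_PREDICATE.get(body.get("label", "").strip())
    if predicate is None:
        return None
    # Minimal cleanup: collapse runs of whitespace left by transcribers.
    value = " ".join(body["chars"].split())
    # Escape backslashes and quotes so the literal stays valid N-Triples.
    literal = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{literal}" .'


# Made-up sample annotations, reduced to the two assumed keys.
annotations = [
    {"resource": {"label": "Name", "chars": "  John   Jones "}},
    {"resource": {"label": "Admission Date", "chars": "3 October 1887"}},
    {"resource": {"label": "Notes", "chars": "unmapped field"}},
]
subject = "http://example.org/student/1"  # placeholder URI

triples = [annotation_to_triple(a, subject) for a in annotations]
lines = [t for t in triples if t is not None]
print("\n".join(lines))
```

Lines in this form could then be loaded into a SPARQL store (stage 4), and the cleanup step extended each time the loop returns to stage 2.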
slide-10
SLIDE 10

Admission Date

from dateutil.parser import parse

date = parse(value, fuzzy=True)  # value is the transcribed date string
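
A slightly fuller sketch of the same fuzzy parse, showing how invalid values might be collected for review. The `dayfirst=True` setting and the fixed `default` are assumptions (UK records usually write day/month/year, and without a default dateutil fills missing parts from today's date); the helper name and sample values are made up.

```python
from datetime import date, datetime

from dateutil.parser import parse

# Assumed fixed default so missing parts are not filled in from today's date.
DEFAULT = datetime(1870, 1, 1)


def parse_admission_date(value):
    """Return a date for a transcribed value, or None if it cannot be parsed."""
    try:
        # fuzzy=True skips words around the date, such as "Admitted".
        # dayfirst=True is an assumption about how the registers write dates.
        return parse(value, fuzzy=True, dayfirst=True, default=DEFAULT).date()
    except (ValueError, OverflowError):
        # dateutil raises ValueError (or a subclass) for impossible dates.
        return None


# Made-up sample values, including one impossible date.
values = ["Admitted 3 October 1887", "3/10/1887", "31 February 1890"]
parsed = {v: parse_admission_date(v) for v in values}
invalid = [v for v, d in parsed.items() if d is None]
print(invalid)
```

Values that end up in `invalid` can then be checked against the page image to decide whether the transcription or the source is at fault.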

slide-11
SLIDE 11

DATA ERRORS

  • 2 Types:
  • Transcription errors
  • Source data errors (or lack of consistency)
slide-14
SLIDE 14

DATE ERRORS

  • Out of 378 pages / people
  • Only 3 invalid dates
slide-21
SLIDE 21

Reconciliation with Wikidata

  • Mixed results
  • UK Schools have changed a lot since 1870!
  • No longer have Grammar schools
  • Many schools don’t have Wikidata (or Wikipedia) information
  • Advantage of Wikidata is you can add them.
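
One possible way to look up candidate matches for a school name is Wikidata's `wbsearchentities` API action. The sketch below is an assumption about how such a lookup could be done, not the tool used in the project; the User-Agent string, the helper names and the sample response are placeholders.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://www.wikidata.org/w/api.php"


def search_url(name, language="en", limit=5):
    """Build a wbsearchentities request URL for a school name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def candidates(response):
    """Reduce an API response to (id, label, description) tuples."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in response.get("search", [])
    ]


def reconcile(name):
    """Query Wikidata over the network; an empty list means no candidates."""
    # Placeholder User-Agent; replace with something identifying your tool.
    request = Request(search_url(name), headers={"User-Agent": "example-reconciler/0.1"})
    with urlopen(request) as resp:
        return candidates(json.load(resp))


# Made-up response in the shape the API returns, so the example runs offline.
sample = {"search": [{"id": "Q0", "label": "Example Grammar School",
                      "description": "placeholder entry"}]}
print(candidates(sample))
```

An empty candidate list is the case described above: the school has no Wikidata entry yet and could be added.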
slide-25
SLIDE 25

SUMMARY

  • Annotations nice to work with
  • Can check results against an image
  • Process them as JSON or Linked Data
  • From data
  • Contains lots of useful data
  • Historical type of database
  • Reconciliation not easy…
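
Checking a result against the image can be sketched as below, assuming the annotation target is a canvas URI with an `#xywh=x,y,w,h` fragment in pixels and that the page image is served through the IIIF Image API. The URIs and the function name are placeholders; the `full` size keyword is the Image API 2.x form (version 3 uses `max`).

```python
from urllib.parse import urldefrag


def region_url(target, image_service):
    """Build an IIIF Image API URL for the region an annotation points at."""
    _canvas, fragment = urldefrag(target)
    if not fragment.startswith("xywh="):
        raise ValueError(f"no xywh fragment in target: {target}")
    # Assumes plain integer pixel values, not the percent form.
    x, y, w, h = (int(n) for n in fragment[len("xywh="):].split(","))
    # Pattern: {base}/{region}/{size}/{rotation}/{quality}.{format}
    return f"{image_service.rstrip('/')}/{x},{y},{w},{h}/full/0/default.jpg"


# Placeholder canvas and image service URIs.
url = region_url(
    "http://example.org/canvas/1#xywh=100,200,300,50",
    "http://example.org/iiif/page1",
)
print(url)
```

Opening the resulting URL shows just the cropped region, which can be compared with the transcribed text for that field.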