An ontology for digital graphematics and philology

Paolo Monella. An ontology for digital graphematics and philology. Die (hyper-)diplomatische Transkription und ihre Erkenntnispotentiale, Bergische Universität Wuppertal (BUW), 6 February 2020.


  1. Paolo Monella: An ontology for digital graphematics and philology. Die (hyper-)diplomatische Transkription und ihre Erkenntnispotentiale, Bergische Universität Wuppertal (BUW), 6 February 2020

  2. Outline

  3. Outline ● Interoperability of digital scholarly editions (DSEs) based on diplomatic transcriptions ● Digital modelling (ontology) of pre-modern writing systems – Graphemes / allographs – Allographs: capitals, ligatures, positional variants, emphasis etc. ● In practice: how can grapheme/allograph modelling make my DSE more interoperable? ● Open issues

  4. Interoperability: the issue

  6. Interoperability: the issue ● uenenū

  7. Interoperability: the issue ● uenenū (diplomatic) ● Historical documentation ● Visualization ● Processing ● (Erkenntnispotentiale)

  8. Interoperability: the issue ● uenenū

  9. Interoperability: the issue ● uenenū ● venenum

  10. Interoperability: the issue ● Processing ● Search ● Collation ● NLP (lemma, PoS etc.) ● Statistics (dist. reading) ● uenenū ● venenum

  11. Interoperability: the issue ● Processing ● Search ● Collation ● NLP (lemma, PoS etc.) ● Statistics (dist. reading) ● uenenū ● venenum venenum
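
The contrast between uenenū and venenum can be sketched as a token that carries two layers, so that search, collation or NLP run on the regularized layer while the diplomatic layer stays intact. This is only an illustrative sketch: the `Token` class, its field names and the `search` helper are assumptions made for this example, not part of the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    # Hypothetical two-layer token (names are assumptions for this sketch)
    diplomatic: str  # what the witness shows, e.g. "uenenū"
    normalized: str  # regularized form supplied by the editor, e.g. "venenum"


def search(tokens, query):
    """Return every token whose normalized layer equals the query."""
    return [t for t in tokens if t.normalized == query]


tokens = [Token("uenenū", "venenum"), Token("uł", "vel")]

# A search for the regularized spelling finds the diplomatic form.
hits = search(tokens, "venenum")
```

Searching for the diplomatic spelling itself returns nothing in this sketch, which is the interoperability gap the slides describe.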

  15. Interoperability: the issue ● My focus: European Medieval handwriting – ...and early print (imitating handwriting)

  16. Interoperability: current solutions

  17. Unicode (TEI’s recommendation) ● Solution for new digital texts ● Not enough for pre-modern writing systems – Allographs: ſ (U+017F) / s (U+0073; ASCII 115) ● Have I encoded that they correspond to each other (variants of grapheme <s>)?
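
A small check with Python's standard `unicodedata` module illustrates the point about ſ and s: the two shapes are separate characters, so a plain string comparison does not know they realize the same grapheme. Unicode's compatibility normalization (NFKC) happens to fold this particular pair, but that is a generic character mapping rather than a statement about a given writing system, and a project cannot assume it exists for every scribal shape.

```python
import unicodedata

LONG_S = "\u017f"   # ſ
ROUND_S = "\u0073"  # s

# Two code points, two character names
code_points = (hex(ord(LONG_S)), hex(ord(ROUND_S)))
names = (unicodedata.name(LONG_S), unicodedata.name(ROUND_S))

# A plain comparison treats the two spellings as different strings
plain_match = ("\u017fchlo\u017fs" == "schloss")

# NFKC compatibility normalization folds long s to round s
nfkc_form = unicodedata.normalize("NFKC", "\u017fchlo\u017fs")
```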

  18. Project-specific solutions ● Disposable home-made solutions ● Normalization software and strategies ● TEI: theory-agnostic

  19. Interoperability through modelling

  20. Interoperability through modelling ● Scholarly discussion on modelling ● Documenting project-specific modelling and normalization practices – prose – formal (software code, tables) ● Shared models ● Reusable software libraries

  21. An ontology for digital graphematics and philology

  22. Ontology ● Lemma ● Token (inflected word) ● Alphabeme ● Grapheme, with is_a subtypes: Logograph, Alphabetic Grapheme, Abbreviation Mark, Brevigraph, Diacritic, Space, Punctuation, Metamark ● Ligature ● Abbreviation ● Allograph

  23. Ontology (is_a hierarchy, node labels as shown on the slide) ● Grapheme ● Linguistic Gr. ● Textual Gr. ● Punctuation ● Metamark ● Space ● Logograph ● Intra-verbal Gr. ({+alphabetic} / {-alphabetic}) ● Diacritic ● Alphabetic Grapheme ● Abbreviation Mark ● Brevigraph
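
The is_a relation on the ontology slides can be sketched as a class hierarchy. This is a deliberately flat reading: every leaf label is attached directly to `Grapheme`, and the intermediate groupings (Linguistic Gr., Textual Gr., Intra-verbal Gr.) are left out, because the flattened slide text does not make their membership unambiguous. The Python class names and the `is_a` helper are assumptions made for this sketch.

```python
class Grapheme:
    """Root of the sketch: every class below is_a Grapheme."""


class Logograph(Grapheme): pass
class AlphabeticGrapheme(Grapheme): pass
class AbbreviationMark(Grapheme): pass
class Brevigraph(Grapheme): pass
class Diacritic(Grapheme): pass
class Space(Grapheme): pass
class Punctuation(Grapheme): pass
class Metamark(Grapheme): pass


def is_a(cls, parent):
    """True when cls is the parent class or one of its subtypes."""
    return issubclass(cls, parent)


# Names of the direct subtypes, in alphabetical order
SUBTYPES = sorted(cls.__name__ for cls in Grapheme.__subclasses__())
```

Modelling space, punctuation and metamarks as graphemes keeps them in the same type system as letters, which matches the later slides where space and punctuation carry distinctive value.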

  26. Graphemes/allographs

  27. Graphemes/allographs: the commutation test ● Text: co̊paraƐur uł adſe uładalium (Comparatur vel ad se vel ad alium, "He is compared to himself or to another") ● System: <s> <t> <x> <y> <z>

  28. Graphemes/allographs: the commutation test ● Text: co̊paraƐur uł adſe uładalium ● System (graphemes): <s> <t> <x> <y> <z> ● System (allographs): «τ» «√»

  29. Graphemes/allographs: the commutation test ● Text: co̊paraƐur uł adſe uładalium ● System (graphemes): <s> <t> <x> <y> <z> ● System (allographs): «τ» «√» ● Commutation: → change in “denotative meaning” ● Substitution: → no change in “denotative meaning”

  30. Graphemes/allographs: the commutation test ● Text: co̊paraƐur uł adſe uładalium ● Graphemes: <s> <t> <x> <y> <z>; commutation → change in “denotative meaning” ● Allographs: «τ» «√»; substitution → no change in “denotative meaning”

  31. Graphemes/allographs: the commutation test ● Text: co̊paraƐur uł adſe uładalium ● Graphemes: <s> <t> <x> <y> <z>; commutation → change in “denotative meaning” ● Allographs: «τ» «√»; substitution → no change in “denotative meaning” ● Grapheme / allograph table: t: τ | Ɛ | √; u: u | v; z: z
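
The grapheme / allograph table on this slide can be written down as data and inverted, so that each allograph points back to the grapheme it realizes. The sketch below assumes a one-character-per-allograph encoding and passes unknown characters (for instance abbreviation marks) through unchanged; `TABLE`, `GRAPHEME_OF` and `to_graphemes` are illustrative names, not the author's software.

```python
# Grapheme -> allographs, as listed on the slide
TABLE = {
    "t": ["τ", "Ɛ", "√"],
    "u": ["u", "v"],
    "z": ["z"],
}

# Inverted view: allograph -> grapheme
GRAPHEME_OF = {allo: g for g, allos in TABLE.items() for allo in allos}


def to_graphemes(allographic):
    """Replace each known allograph by its grapheme, character by character."""
    return "".join(GRAPHEME_OF.get(ch, ch) for ch in allographic)


# Substituting one allograph of <t> for another leaves the graphemic
# string unchanged, which is what the substitution test expects.
same_word = to_graphemes("paraƐur") == to_graphemes("para√ur")
```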

  32. Graphemes / allographs: what to transcribe? ● What the project wants! – based on its scientific interests – (and on time / money) ● But: framed in a larger model

  33. Can allographs have a distinctive value?

  34. Allographs τ τ τ τ τ Ɛ Ɛ √ √ √ √

  35. Allographs ● τ τ τ τ τ → 1. «τ» ● Ɛ Ɛ → 2. «Ɛ» ● √ √ √ √ → 3. «√»

  36. Capitals: allographs or graphemes? ● Cool (CA) is a cool town (geographical name) ● Smith is a good smith (proper name) ● ODD files are odd files (acronym) ● OK for contemporary Western writing systems ● Not for classical/medieval handwriting (see later)

  37. Capitals: allographs or graphemes? ● Cool (CA) is a cool town (geographical name) ● Smith is a good smith (proper name) ● ODD files are odd files (acronym) ● R. Mordenti: Grapheme <D>, with allographs «d» and «D» ● F. Neuber: Archi-grapheme D, with graphemes <d> and <D> ● P. Monella: Alphabeme D, with graphemes <d> and <D>

  38. Sentence segmentation: distinctive value for meaning of the whole text ● I go because I have to. Stay here! ● I go because I have to stay here! ● Capitals

  39. Sentence segmentation: distinctive value for meaning of the whole text ● I go because I have to. Stay here! ● I go because I have to stay here! ● Punctuation ● Capitals

  40. Word segmentation: distinctive value for meaning of the whole text ● σαῦρος, ſucceſs, daſs (daß)

  41. Word segmentation: distinctive value for meaning of the whole text ● σαῦρος, ſucceſs, daſs (daß) ● Paulus suſtinet me (Paolo holds me up) ● Paulus ſus tinet me (Paolo the pig holds me) ● Positional allograph

  42. Word segmentation: distinctive value for meaning of the whole text ● σαῦρος, ſucceſs, daſs (daß) ● Paulus suſtinet me (Paolo holds me up) ● Paulus ſus tinet me (Paolo the pig holds me) ● Space ● Positional allograph

  43. Connotators

  45. Connotators ● 𝖝𝖎𝖕 ≠ WHO ● Connotator “Gothic” (marked) ● Connotator “Gaul” (not marked) ● Pertinence

  46. Connotators ● Connotators, pertinent for the writer ● graphemes as entities (emphasis) ● the Evangelist wrote (respect)

  47. (Non-)pertinent allographs: positional variants ● Ligatures ● Non-pertinent for the writer ● Connotators, pertinent for (some) readers – editors, paleographers, codicologists, historians studying a MS / book – (Beneventan vs Caroline script, print font, ſ / s) ● Allographs: «τ» «Ɛ» «√»

  48. Distinctive value (pertinence) of allographs? ● Graphemes change denotative meaning – fame vs name – Hjelmslev: denotative semiotics ● Allographs can have other forms of distinctive value (pertinence) – For the writer: 𝖝𝖎𝖕 vs WHO (Hjelmslev: connotative semiotics) – For the reader (digital editor): digital editors can set their own pertinence (transcription) criteria, based on their scientific interests – E.g.: Fraktur font → political connotation in WW1
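
The idea that editors set their own pertinence criteria can be sketched as a transcription function with a configurable set of allographs to keep. Everything here is an assumption for illustration: the `GRAPHEME_OF` table, the `transcribe` name and the `pertinent` parameter are hypothetical, and a real project would document its own table.

```python
# Hypothetical allograph -> grapheme table for one project
GRAPHEME_OF = {"ſ": "s", "τ": "t", "Ɛ": "t", "√": "t"}


def transcribe(allographic, pertinent=frozenset()):
    """Collapse each allograph to its grapheme unless the project
    has declared that allograph pertinent (worth recording)."""
    return "".join(
        ch if ch in pertinent else GRAPHEME_OF.get(ch, ch)
        for ch in allographic
    )


# A project with no interest in letter shapes collapses everything.
graphemic = transcribe("ſchloſs")

# A project studying long s keeps that one distinction and drops the rest.
long_s_kept = transcribe("Vnτer ſchloſs", pertinent={"ſ"})
```

Because both projects share the same table, the second transcription can still be reduced to the first, which is what makes the two editions comparable.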

  49. In practice: how can grapheme/allograph modelling make my DSE more interoperable?

  50. In practice: how can grapheme/allograph modelling make my DSE more interoperable? ● Manual (selective) transcription (witness A) ● OCR/HTT transcription (witness B)

  51. In practice: how can grapheme/allograph modelling make my DSE more interoperable? ● Allographic transcription ● Witness A, manual (selective) transcription: Vnτer <hi>dem</hi> ſchloſs ● Witness B, OCR/HTT transcription: unter dem schloss

  53. In practice: how can grapheme/allograph modelling make my DSE more interoperable? ● Unicode characters ● Allographic transcription ● Witness A, manual (selective) transcription: Vnτer <hi>dem</hi> ſchloſs ● Witness B, OCR/HTT transcription: unter dem schloss
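
One way to read this slide is that both witnesses become comparable once each is reduced to a shared graphemic layer. The sketch below makes several assumptions that a real project would have to state and justify: inline markup such as <hi> is dropped for collation, capitals are treated as allographs (lower-cased), and u/v, τ/t and ſ/s are collapsed through a hypothetical `GRAPHEME_OF` table.

```python
import re

# Hypothetical allograph -> grapheme table shared by both editions
GRAPHEME_OF = {"τ": "t", "ſ": "s", "v": "u"}


def graphemic(transcription):
    """Reduce a transcription to a plain graphemic string for collation."""
    text = re.sub(r"<[^>]+>", "", transcription)  # drop inline tags such as <hi>
    text = text.lower()                           # assumption: capitals are allographs
    return "".join(GRAPHEME_OF.get(ch, ch) for ch in text)


witness_a = "Vnτer <hi>dem</hi> ſchloſs"  # manual (selective) transcription
witness_b = "unter dem schloss"           # OCR/HTT transcription

# The raw strings differ, the graphemic layers agree.
raw_agree = witness_a == witness_b
graphemic_agree = graphemic(witness_a) == graphemic(witness_b)
```

The allographic detail of witness A is not lost: the original string is kept, and the graphemic form is derived from it only when a comparison is needed.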
