An ontology for digital graphematics and philology

Paolo Monella, An ontology for digital graphematics and philology. Die (hyper-)diplomatische Transkription und ihre Erkenntnispotentiale, Bergische Universität Wuppertal (BUW), 6 February 2020


  1. Paolo Monella: An ontology for digital graphematics and philology. Die (hyper-)diplomatische Transkription und ihre Erkenntnispotentiale, Bergische Universität Wuppertal (BUW), 6 February 2020

  2. Outline

  3. Outline ● Interoperability of digital scholarly editions (DSEs) based on diplomatic transcriptions ● Digital modelling (ontology) of pre-modern writing systems – Graphemes / allographs – Allographs: capitals, ligatures, positional variants, emphasis etc. ● In practice: how can grapheme/allograph modelling make my DSE more interoperable? ● Open issues

  4. Interoperability: the issue

  5. Interoperability: the issue

  6. Interoperability: the issue ● uenenū

  7. Interoperability: the issue ● uenenū (diplomatic) – Historical documentation – Visualization – Processing – (Erkenntnispotentiale)

  8. Interoperability: the issue ● uenenū

  9. Interoperability: the issue ● uenenū ● venenum

  10. Interoperability: the issue ● uenenū ● venenum ● Processing ● Search ● Collation ● NLP (lemma, PoS etc.) ● Statistics (distant reading)

  11. Interoperability: the issue ● uenenū ● venenum venenum ● Processing ● Search ● Collation ● NLP (lemma, PoS etc.) ● Statistics (distant reading)

  12. Interoperability: the issue ● uenenū ● venenum ● Processing ● Search ● Collation ● NLP (lemma, PoS etc.) ● Statistics (distant reading)

  13. Interoperability: the issue ● uenenū ● venenum ● Processing ● Search ● Collation ● NLP (lemma, PoS etc.) ● Statistics (distant reading)

  14. Interoperability: the issue ● uenenū ● venenum ● Processing ● Search ● Collation ● NLP (lemma, PoS etc.) ● Statistics (distant reading)
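One way to let search and collation tools treat the diplomatic form uenenū and the normalized form venenum as the same word is to fold both to a shared comparison key. The sketch below is only an illustration: the function name `search_key` and both folding rules (a macron abbreviates a following "m"; u and v are not distinguished) are assumptions made for this example, not rules stated in the talk, and a real manuscript would need its own documented rules.

```python
import unicodedata

COMBINING_MACRON = "\u0304"  # the stroke over the last letter of "uenenū"


def search_key(form: str) -> str:
    """Fold a diplomatic or normalized word form to a shared comparison key."""
    # NFD splits precomposed letters (e.g. U+016B) into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", form.lower())
    # Assumed rule for this example: a macron abbreviates a following "m".
    expanded = decomposed.replace(COMBINING_MACRON, "m")
    # Assumed rule for this example: u and v are not distinguished, so fold v onto u.
    return expanded.replace("v", "u")


diplomatic = "uenen\u016b"  # uenenū
normalized = "venenum"
print(search_key(diplomatic), search_key(normalized))
print(search_key(diplomatic) == search_key(normalized))
```

Both forms fold to the same key, so a search for either one finds the other. The key is meant for matching only; it is not a replacement for either transcription layer.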

  15. Interoperability: the issue ● My focus: European medieval handwriting – ...and early print (imitating handwriting)

  16. Interoperability: the issue ● My focus: European medieval handwriting – ...and early print (imitating handwriting) – Pre-Gutenberg (and shortly after) ● Alphabetic writing systems (so far) – Latin script (Italian, English...), Greek, Cyrillic... – No non-alphabetic (Cuneiform, Arabic, Chinese etc.)

  17. Interoperability: current solutions

  18. Unicode (TEI’s recommendation) ● Solution for new digital texts ● Not enough for pre-modern writing systems – Allographs: ſ (U+017F) / s (U+0073; ASCII 115). Have I encoded that they correspond to each other (variants of grapheme <s>)?

  19. Unicode (TEI’s recommendation) ● Solution for new digital texts ● Not enough for pre-modern writing systems – Allographs: ſ (U+017F) / s (U+0073; ASCII 115). Have I encoded that they correspond to each other (variants of grapheme <s>)? – Ligatures: & (U+0026; ASCII 38). Have I encoded that it is equivalent to “e + t” in that MS? – Grapheme set: u (U+0075; ASCII 117). Have I encoded whether it “covers” (or not) <u> and <v>?
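The three questions on the Unicode slide can be made concrete with a short sketch. Python's standard `unicodedata` module shows what Unicode itself knows: compatibility normalization (NFKC) happens to fold ſ onto s, but it says nothing about & standing for "e + t" in a given manuscript, or about which letters u covers. The `MS_MODEL` dictionary and the `resolve` and `covers` helpers are hypothetical names for a per-manuscript declaration; they are not part of Unicode, TEI, or the talk.

```python
import unicodedata

# What Unicode knows: NFKC folds long s onto s ...
print(unicodedata.normalize("NFKC", "\u017f") == "s")
# ... but it leaves "&" untouched: the "e + t" reading is manuscript-specific.
print(unicodedata.normalize("NFKC", "&") == "&")

# Hypothetical per-manuscript declaration answering the three questions explicitly.
MS_MODEL = {
    "allographs": {"\u017f": "s"},   # long s is a variant of grapheme <s>
    "ligatures": {"&": ("e", "t")},  # & is equivalent to e + t in this MS
    "covers": {"u": ("u", "v")},     # grapheme <u> covers modern u and v
}


def resolve(glyph, model=MS_MODEL):
    """Return the tuple of graphemes that one transcribed glyph stands for."""
    if glyph in model["ligatures"]:
        return model["ligatures"][glyph]
    return (model["allographs"].get(glyph, glyph),)


def covers(grapheme, model=MS_MODEL):
    """Return the modern letters that a grapheme of this MS may correspond to."""
    return model["covers"].get(grapheme, (grapheme,))


print(resolve("\u017f"), resolve("&"), covers("u"))
```

Keeping the declaration as data rather than burying it in normalization code is what makes it possible to document and share it alongside the edition.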

  20. Diplomatic/normalized: the surrender? ● venenum (normalized) – Processing – Search – Collation – NLP (lemma, PoS etc.) – Statistics (distant reading)... ● uenenū (diplomatic) – Historical documentation – Visualization – Processing – (Erkenntnispotentiale)

  21. Project-specific solutions ● Disposable home-made solutions ● Normalization software and strategies ● TEI: theory-agnostic

  22. Interoperability through modelling

  23. Interoperability through modelling ● Scholarly discussion on modelling ● Documenting project-specific modelling and normalization practices – prose – formal (software code, tables) ● Shared models ● Reusable software libraries

  24. An ontology for digital graphematics and philology

  25. Ontology (diagram) ● Entities: Lemma, Token (inflected word), Alphabeme, Logograph, Alphabetic Grapheme, Abbreviation Mark, Brevigraph, Ligature, Grapheme, Diacritic, Abbreviation, Space, Punctuation, Metamark, Allograph ● Relation shown: is_a

  26. Ontology (is_a hierarchy) ● Grapheme ● Linguistic Gr. / Textual Gr. ● Punctuation, Metamark, Space, Logograph, Intra-verbal Gr. ● {+alphabetic} / {-alphabetic} ● Diacritic, Alphabetic Grapheme, Abbreviation Mark, Brevigraph
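An is_a hierarchy of this kind maps naturally onto class inheritance. The sketch below is one possible reading of the flattened diagram, not the author's published ontology: the parent assigned to each class is an assumption, and the {+alphabetic} / {-alphabetic} split is left out because the transcript does not show which leaf falls on which side. The helper `lineage` is a hypothetical name.

```python
class Grapheme:
    """Root of the is_a hierarchy."""


class LinguisticGrapheme(Grapheme):
    pass


class TextualGrapheme(Grapheme):
    pass


# Assumed children of Textual Gr.
class Punctuation(TextualGrapheme):
    pass


class Metamark(TextualGrapheme):
    pass


class Space(TextualGrapheme):
    pass


# Assumed children of Linguistic Gr.
class Logograph(LinguisticGrapheme):
    pass


class IntraVerbalGrapheme(LinguisticGrapheme):
    pass


# Assumed children of Intra-verbal Gr. (alphabetic feature not modelled here)
class AlphabeticGrapheme(IntraVerbalGrapheme):
    pass


class AbbreviationMark(IntraVerbalGrapheme):
    pass


class Brevigraph(IntraVerbalGrapheme):
    pass


class Diacritic(IntraVerbalGrapheme):
    pass


def lineage(cls):
    """Return the is_a chain of a class, most specific first."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


print(lineage(Brevigraph))
print(issubclass(Space, LinguisticGrapheme))
```

With classes like these, a tool can ask generic questions ("is this sign any kind of grapheme?") without knowing each project's specific sign inventory.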

  27. Ontology (diagram) ● Entities: Lemma, Token (inflected word), Alphabeme, Logograph, Alphabetic Grapheme, Abbreviation Mark, Brevigraph, Ligature, Grapheme, Diacritic, Abbreviation, Space, Punctuation, Metamark, Allograph ● Relation shown: is_a

  28. Ontology (diagram) ● Entities: Lemma, Token (inflected word), Alphabeme, Logograph, Alphabetic Grapheme, Abbreviation Mark, Brevigraph, Ligature, Grapheme, Diacritic, Abbreviation, Space, Punctuation, Metamark, Allograph ● Relation shown: is_a

  29. Digital modelling for pre-modern writing systems

  30. Digital modelling

  31. Digital modelling ● Comparatur vel ad se vel ad alium (“He is compared to himself or to another”) ● co̊paraƐur uł adſe uładalium

  32. Digital modelling ● Comparatur vel ad se vel ad alium (“He is compared to himself or to another”) ● co̊paraƐur uł adſe uładalium

  33. Digital modelling ● Comparatur vel ad se vel ad alium (“He is compared to himself or to another”) ● co̊paraƐur uł adſe uładalium ● Digital modelling

  34. Digital modelling ● co̊paraƐur uł adſe uładalium

  35. A structural approach to digital modelling ● Text: co̊paraƐur uł adſe uładalium ● Analysis ● System (entities): <s> <t> <x> <y> <z>

  36. A structural approach to digital modelling ● Text: co̊paraƐur uł adſe uładalium ● Analysis ● System (entities): <s> <t> <x> <y> <z> ● Digital modelling

  37. Graphemes/allographs

  38. Graphemes/allographs: the commutation test ● Comparatur vel ad se vel ad alium (“He is compared to himself or to another”) ● Text: co̊paraƐur uł adſe uładalium ● System: <s> <t> <x> <y> <z>

  39. Graphemes/allographs: the commutation test ● Text: co̊paraƐur uł adſe uładalium ● System: <s> <t> <x> <y> <z> ● System: « τ » « √ »

  40. Graphemes/allographs: the commutation test ● co̊paraƐur uł adſe uładalium ● <s> <t> <x> <y> <z> ● « τ » « √ » ● Commutation → change in “denotative meaning” ● Substitution → no change in “denotative meaning”

  41. Graphemes/allographs: the commutation test ● co̊paraƐur uł adſe uładalium ● Graphemes: <s> <t> <x> <y> <z> ● Allographs: « τ » « √ » ● Commutation → change in “denotative meaning” ● Substitution → no change in “denotative meaning”

  42. Graphemes/allographs: the commutation test ● co̊paraƐur uł adſe uładalium ● Graphemes: <s> <t> <x> <y> <z> ● Allographs: « τ » « √ » ● Commutation → change in “denotative meaning” ● Substitution → no change in “denotative meaning” ● Grapheme: allographs – t: τ | Ɛ | √ – u: u | v – z: z
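The grapheme/allograph table on the last commutation-test slide (t: τ | Ɛ | √; u: u | v; z: z) can be written down as data and used to reduce a transcription to its grapheme layer. This is a minimal sketch under that reading of the table; the names `ALLOGRAPHS`, `GRAPHEME_OF`, `to_graphemes` and `are_allographs` are invented for the example, and signs missing from the table are simply passed through unchanged.

```python
# Grapheme -> allographs, as listed in the slide's table.
ALLOGRAPHS = {
    "t": ["τ", "Ɛ", "√"],
    "u": ["u", "v"],
    "z": ["z"],
}

# Inverted lookup: allograph -> grapheme.
GRAPHEME_OF = {
    allograph: grapheme
    for grapheme, variants in ALLOGRAPHS.items()
    for allograph in variants
}


def to_graphemes(transcription: str) -> str:
    """Replace every listed allograph with its grapheme; keep other signs as they are."""
    return "".join(GRAPHEME_OF.get(sign, sign) for sign in transcription)


def are_allographs(a: str, b: str) -> bool:
    """True if both signs are listed under the same grapheme (substitution, no change in meaning)."""
    return a in GRAPHEME_OF and GRAPHEME_OF.get(a) == GRAPHEME_OF.get(b)


print(to_graphemes("paraƐur"))
print(are_allographs("u", "v"), are_allographs("u", "z"))
```

Two transcriptions that differ only in their choice of allographs reduce to the same grapheme string, which is the property a collation or search tool needs.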

  43. Graphemes / allographs: what to transcribe? ● What the project wants! – based on its scientific interests – (and on time / money) ● But: framed in a larger model

  44. Saussure, pertinence and the scribe’s toolbox

  45. Saussure, pertinence and the scribe’s toolbox ● MS A: a b c d e f g h i l m n o p q r s t u z · ; ● MS B: a b c d e f g h i j l m n o p q r s t u v z . , ; : !

  46. Saussure, pertinence and the scribe’s toolbox ● OCR from Teubner: a b c d e f g h i l m n o p q r s t u z · ; ● OCR from Loeb: a b c d e f g h i j l m n o p q r s t u v z . , ; : !
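Treating each toolbox as a set makes the difference between the two inventories above (MS A / Teubner and MS B / Loeb) explicit. The sketch uses only the signs listed on the slides; the variable names are chosen for the example.

```python
# Sign inventories exactly as listed on the slides.
MS_A = set("abcdefghilmnopqrstuz") | {"\u00b7", ";"}  # U+00B7 is the middle dot
MS_B = set("abcdefghijlmnopqrstuvz") | set(".,;:!")

only_in_b = sorted(MS_B - MS_A)
only_in_a = sorted(MS_A - MS_B)
shared = MS_A & MS_B

print("only in MS B:", only_in_b)
print("only in MS A:", only_in_a)
print("shared signs:", len(shared))
```

The second toolbox adds j, v and four punctuation signs, while the middle dot occurs only in the first. A sign such as u therefore does not have the same value in both systems, even though it is the same Unicode character.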

  47. Saussure, pertinence and the scribe’s toolbox ● The toolbox of the scribe – Definition of graphemes, allographs… ● Writing systems as autonomous semiotic systems (Sampson) – Not as epiphenomena of oral language (phonemes) – Mandarin / Cantonese – “Opaque” orthographies (English): “knight”, “aile”, “read”, “read” (past tense) – Medieval MSS: pronunciation? ● a b c d e f g h i j l m n o p q r s t u v z . , ; : !

  48. Saussure, pertinence and the scribe’s toolbox ● “In language there are only differences” (Saussure) – “But the statement that everything in language is negative is true only if the signified and the signifier are considered separately; when we consider the sign in its totality, we have something that is positive in its own class” ● a b c d e f g h i j l m n o p q r s t u v z . , ; : !

  49. Saussure, pertinence and the scribe’s toolbox ● Can we define the scribe’s (graphematic, signifier) toolbox under complete ignorance of the linguistic (meaning, signified) dimension? ● a b c d e f g h i j l m n o p q r s t u v z . , ; : !

  50. Saussure, pertinence and the scribe’s toolbox ● Can we define the scribe’s toolbox under complete ignorance of the linguistic dimension? ● a b c d e f g h i j l m n o p q r s t u v z . , ; : !
