A Less Distant Future: Sanskrit Texts for Scholarly Communities in the Digital Age



SLIDE 1

A Less Distant Future

Sanskrit Texts for Scholarly Communities in the Digital Age
Andrew Ollett
May 24, 2017

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

SLIDE 2

Table of Contents

Machine-readable texts
Scholar-readable texts
Community-readable texts

SLIDE 4

A Sanskrit ślōka (anuṣṭubh)

<lg>
  <l>विशुद्धज्ञानदेहाय त्रिवेदीदिव्यचक्षुषे ।</l>
  <l>श्रेयःप्राप्तिनिमित्ताय नमः सोमार्धधारिणे ॥</l>
</lg>

SLIDE 5

A Sanskrit ślōka (śārdūlavikrīḍita)

िवाता मुिनवयसूिषु िवधाो िवधोतसाम् आचायिवशदं िविविवषयाा वािपताः ।
िक ं ता िवचायमायामथते माग िनसगले नानोदाहरणैु ताः िवशदीकतु वतामहे ॥

SLIDE 6

A Sanskrit ślōka (śārdūlavikrīḍita) with lines

<lg>
  <l>वयाता मुनवयसूषु वधािःतॐो वधॐोतसाम ्</l>
  <l>आचायवशदं वववषयाःता यवःथापताः ।</l>
  <l>क ं तऽािःत वचायमायामिथते माग नसगवले</l>
  <l>नानोदाहरणैःतु ताः ूवशदकतु ूवतामहे ॥</l>
</lg>

SLIDE 7

A Sanskrit ślōka (śārdūlavikrīḍita) with segments

<lg>
  <l>
    <seg type="quarter">वयाता मुनवयसूषु वधािःतॐो वधॐोतसाम ्</seg>
    <seg type="quarter">आचायवशदं वववषयाःता यवःथापताः ।</seg>
  </l>
  <l>
    <seg type="quarter">क ं तऽािःत वचायमायामिथते माग नसगवले</seg>
    <seg type="quarter">नानोदाहरणैःतु ताः ूवशदकतु ूवतामहे ॥</seg>
  </l>
</lg>

SLIDE 8

A Sanskrit ślōka (śārdūlavikrīḍita) with line breaks

<lg>
  <l>वयाता मुनवयसूषु वधािःतॐो वधॐोतसाम ्<lb/>
     आचायवशदं वववषयाःता यवःथापताः ।</l>
  <l>क ं तऽािःत वचायमायामिथते माग नसगवले<lb/>
     नानोदाहरणैःतु ताः ूवशदकतु ूवतामहे ॥</l>
</lg>

SLIDE 9

A Sanskrit ślōka (śārdūlavikrīḍita) with caesuras

<lg>
  <l>वयाता मुनवयसूषु वधािःतॐो वधॐोतसाम ्<caesura/>
     आचायवशदं वववषयाःता यवःथापताः ।</l>
  <l>क ं तऽािःत वचायमायामिथते माग नसगवले<caesura/>
     नानोदाहरणैःतु ताः ूवशदकतु ूवतामहे ॥</l>
</lg>
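
The encodings on the preceding slides differ in how easily a program can reach the four quarters (pādas) of a verse. The following is a minimal Python sketch of that difference, not part of the original slides. It assumes un-namespaced elements and uses placeholder text ("pada a" and so on) instead of the actual verse; the function name `quarters` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Empty elements that mark a division inside <l> (as on the slides).
MILESTONES = ("lb", "caesura")


def quarters(lg: ET.Element) -> List[str]:
    """Return the metrical units of an <lg>, whichever encoding it uses."""
    units: List[str] = []
    for line in lg.findall("l"):
        segs = line.findall("seg[@type='quarter']")
        if segs:
            # Explicit containers: each <seg type="quarter"> is one unit.
            units.extend("".join(s.itertext()).strip() for s in segs)
            continue
        # Milestones: split the line's text at each <lb/> or <caesura/>.
        pieces = [line.text or ""]
        for child in line:
            if child.tag in MILESTONES:
                pieces.append(child.tail or "")
            else:
                pieces[-1] += "".join(child.itertext()) + (child.tail or "")
        units.extend(p.strip() for p in pieces if p.strip())
    return units


# Placeholder text stands in for the verse (an assumption of this sketch).
FOUR_LINES = "<lg><l>pada a</l><l>pada b</l><l>pada c</l><l>pada d</l></lg>"
WITH_SEGS = (
    '<lg><l><seg type="quarter">pada a</seg> <seg type="quarter">pada b</seg></l>'
    '<l><seg type="quarter">pada c</seg> <seg type="quarter">pada d</seg></l></lg>'
)
WITH_LB = "<lg><l>pada a<lb/> pada b</l><l>pada c<lb/> pada d</l></lg>"
WITH_CAESURA = WITH_LB.replace("lb/", "caesura/")

for sample in (FOUR_LINES, WITH_SEGS, WITH_LB, WITH_CAESURA):
    print(quarters(ET.fromstring(sample)))
```

All four samples yield the same four units. An `<lg>` with two undivided `<l>` elements, as in the anuṣṭubh example, would yield two half-verses instead, since nothing in the markup locates the quarter boundary.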

SLIDE 10

Getting Inside Texts: Structure

<div type="adhyaya" n="1">
  <!-- etc. etc. !-->
  <div type="pada" n="4">
    <div type="adhikarana" n="1">
      <head type="toc">उद्भिदधिकरणम्</head>
      <div type="sutra" n="1">
        <p>एवं स्मृतिसहितस्य वेदस्य प्रामाण्ये सिद्धे <!-- etc. etc. !-->

SLIDE 11

Getting Inside Texts: Structure

▶ Navigation: Link from outside of SARIT, or navigate within SARIT, to (e.g.) “the first sūtra in the fourth pāda of the first adhyāya of the Tantravārttika”
  ▶ interpret canonical references like TaVā:1.4.1 as XPath references
▶ Alignment: Give all of the commentaries to the first sūtra etc.
  ▶ requires specifying and interpreting relations
  ▶ maybe also a “mapping” from one text to another
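
Interpreting a canonical reference as an XPath reference can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SARIT's implementation: the `type` and `n` attributes follow the structure slide, the three-part scheme adhyāya.pāda.sūtra is read off the example TaVā:1.4.1, the TEI namespace is omitted, and the function name and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def canonical_to_xpath(ref: str) -> str:
    """Turn a reference like 'TaVā:1.4.1' into a path relative to the text root."""
    siglum, _, numbers = ref.partition(":")
    parts = numbers.split(".")
    if not siglum or len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a siglum:adhyaya.pada.sutra reference: {ref!r}")
    adhyaya, pada, sutra = parts
    # '//' before the sutra skips intermediate levels such as the adhikarana.
    return (
        f".//div[@type='adhyaya'][@n='{adhyaya}']"
        f"/div[@type='pada'][@n='{pada}']"
        f"//div[@type='sutra'][@n='{sutra}']"
    )


# Hypothetical miniature document with placeholder text.
SAMPLE = """<body>
  <div type="adhyaya" n="1">
    <div type="pada" n="4">
      <div type="adhikarana" n="1">
        <div type="sutra" n="1"><p>placeholder for 1.4.1</p></div>
        <div type="sutra" n="2"><p>placeholder for 1.4.2</p></div>
      </div>
    </div>
  </div>
</body>"""

root = ET.fromstring(SAMPLE)
hit = root.find(canonical_to_xpath("TaVā:1.4.1"))
print(hit.find("p").text)  # placeholder for 1.4.1
```

A real resolver would also need to map the siglum to a document; here it is only checked for presence.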

SLIDE 12

Alignment: Sharing Ancient Wisdoms

SLIDE 14

Getting Inside Texts: Words

▶ Word-analysis is necessary:
  ▶ Lemmatization
  ▶ Morphological analysis
  ▶ Dictionary integration
  ▶ Dependency relations
  ▶ Statistical language models

SLIDE 15

Getting Inside Texts: Words

▶ Given an unanalyzed text as input, rule-based approaches can never output a reliably analyzed text.
▶ It can only be done with machine learning.
▶ How would we relate the input and output anyway?
  ▶ derivative file using TEI <choice>
  ▶ a set of annotations (using the OA framework)
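
The second option, a set of standoff annotations, might look roughly like the following Python sketch. It is an assumption-laden illustration: it uses the JSON-LD shape of the W3C Web Annotation Data Model (the successor to Open Annotation), and the source URI, sample sentence, function name, and morphological tag format are all hypothetical.

```python
import json


def word_annotation(source_uri, text, start, end, lemma, morph):
    """Build one standoff annotation for the token text[start:end]."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "format": "application/json",
            # The analysis travels as a JSON string; the keys are this sketch's own.
            "value": json.dumps({"lemma": lemma, "morph": morph}, ensure_ascii=False),
        },
        "target": {
            "source": source_uri,
            # Offsets are Python code-point offsets into the unanalyzed text.
            "selector": {"type": "TextPositionSelector", "start": start, "end": end},
        },
    }


TEXT = "rāmo vanaṃ gacchati"  # hypothetical sample sentence
START = TEXT.index("gacchati")
ANNOTATION = word_annotation(
    "https://example.org/texts/sample",  # hypothetical URI
    TEXT, START, START + len("gacchati"),
    lemma="gam", morph="3.sg.pres.act",
)
print(json.dumps(ANNOTATION, ensure_ascii=False, indent=2))
```

Because the annotation only points into the text, the unanalyzed input stays untouched and several competing analyses can coexist.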

SLIDE 16

Getting there

▶ We’re almost there anyway.
▶ Make all text objects (texts and their constituent parts) referenceable by using URIs and a canonical reference system.
▶ Follow Linked Data practices.
▶ Wait for someone else to solve the word-analysis problem.

SLIDE 17

Evidence

SLIDE 18

Table of Contents

Machine-readable texts
Scholar-readable texts
Community-readable texts

SLIDE 19

Human-readability

Reality check

Only a very small fraction of scholars in South Asian studies will ever really learn TEI and the (ever-expanding) set of standards and technologies that goes along with it.
→ We need to always make sure that our texts can be used (and ideally improved or contributed to!) by “Ordinary Working Sanskritists.”

SLIDE 21

Scholar-readability

Reality check

Scholars want more information than you’ll ever be able to supply them with, and they resent having to use your website, mostly because they don’t like the way it looks.

SLIDE 22

Two principles

Don’t make a mess

नाकुलता भवेत् क्वचित्

Don’t leave anything out

नावहीयेत किंचन

SLIDE 24

Transforming TEI: Earlier approaches

▶ Directly style TEI with CSS (Philologic)
▶ Transform TEI to HTML with XSL
▶ Transform TEI to HTML with XQuery (current SARIT website)

SLIDE 25

Transforming TEI: The new approach

▶ ODD: The same file that specifies the structural requirements of a text (the schema) also specifies the behavior of the text’s elements (as well as the documentation).
▶ These ‘abstract’ behaviors are then used to automatically generate stylesheets for transformations to different media.
▶ One set of instructions for HTML, PDF (through LaTeX), EPUB, etc.
▶ Chained processing: “general” behaviors (e.g., from TEI or EpiDoc) modified by more specific behaviors

ODD-aware publishing platform: TEI Publisher (Wolfgang Meier/eXist Solutions)
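
The idea of abstract behaviors rendered to several media, with a general layer modified by a more specific one, can be sketched in Python as follows. This is only an illustration of the principle, not TEI Publisher's or the TEI processing model's actual machinery: the behavior names, the `GENERAL` and `PROJECT` mappings, and the output templates are assumptions, and no escaping of special characters is attempted.

```python
import xml.etree.ElementTree as ET

# Element name -> abstract behaviour. GENERAL stands in for a base
# specification, PROJECT for a customization chained on top of it.
GENERAL = {"head": "heading", "p": "paragraph"}
PROJECT = {"head": "block"}

# One template per behaviour and medium; the element rules above are shared.
RENDERERS = {
    "html": {
        "heading": lambda t: f"<h2>{t}</h2>",
        "paragraph": lambda t: f"<p>{t}</p>",
        "block": lambda t: f"<div>{t}</div>",
    },
    "latex": {
        "heading": lambda t: f"\\section*{{{t}}}\n",
        "paragraph": lambda t: f"{t}\n\n",
        "block": lambda t: f"\\noindent {t}\n\n",
    },
}


def render(elem, medium, behaviours):
    """Render an element and its descendants for one output medium."""
    inner = elem.text or ""
    for child in elem:
        inner += render(child, medium, behaviours) + (child.tail or "")
    behaviour = behaviours.get(elem.tag)
    if behaviour is None:
        return inner  # no behaviour assigned: pass the content through
    return RENDERERS[medium][behaviour](inner.strip())


DOC = ET.fromstring("<div><head>Title</head><p>Some text.</p></div>")
print(render(DOC, "html", GENERAL))                  # <h2>Title</h2><p>Some text.</p>
print(render(DOC, "latex", GENERAL))
print(render(DOC, "html", {**GENERAL, **PROJECT}))   # <div>Title</div><p>Some text.</p>
```

The element-to-behavior rules are written once; adding a medium means adding templates, and a project overrides only the rules it cares about.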

SLIDE 26

The ODD in action

SLIDE 27

An ecosystem of modular text services

▶ Better integration of existing services (e.g., dictionaries).
▶ On-the-fly transliteration based on user preferences.
▶ New output formats: plain text, JSON, RDF.
▶ Better support for annotation and notetaking.
▶ Always offer page images.
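
As a small illustration of transliteration driven by a user preference, the Python sketch below converts lowercase IAST romanization to Harvard-Kyoto. It is a deliberately narrow example: the function names and the `scheme` parameter are hypothetical, only lowercase IAST letters are mapped, and script conversion (for example to or from Devanagari) is out of scope.

```python
import unicodedata

# Lowercase IAST letters that differ from Harvard-Kyoto; everything else
# (a, i, u, e, o, k, kh, ...) is identical in both schemes.
_IAST_TO_HK = {
    "ā": "A", "ī": "I", "ū": "U", "ṛ": "R", "ṝ": "RR", "ḷ": "lR",
    "ṃ": "M", "ḥ": "H", "ṅ": "G", "ñ": "J", "ṭ": "T", "ḍ": "D",
    "ṇ": "N", "ś": "z", "ṣ": "S",
}
# Normalize the keys so each is a single precomposed character.
_TABLE = str.maketrans(
    {unicodedata.normalize("NFC", k): v for k, v in _IAST_TO_HK.items()}
)


def to_harvard_kyoto(iast: str) -> str:
    """Convert lowercase IAST to Harvard-Kyoto."""
    return unicodedata.normalize("NFC", iast).translate(_TABLE)


def display(iast: str, scheme: str = "iast") -> str:
    """Return the text in the scheme the user asked for."""
    if scheme == "iast":
        return iast
    if scheme == "hk":
        return to_harvard_kyoto(iast)
    raise ValueError(f"unsupported scheme: {scheme!r}")


print(display("śārdūlavikrīḍita", "hk"))  # zArdUlavikrIDita
print(display("anuṣṭubh", "hk"))          # anuSTubh
```

Storing the text in one scheme and converting at display time keeps the underlying data stable while letting each reader choose a preferred rendering.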

SLIDE 28

The view from 1929

Manavalli Ramakrishna Kavi

An ideal edition requires that complete photographs of all copies of the originals with their transliteration should supplement an edition which must be in a consolidated form as some of the best Western publications are.

SLIDE 29

Table of Contents

Machine-readable texts
Scholar-readable texts
Community-readable texts

SLIDE 30

Community-based pāṇḍitya

SLIDE 31

Community-based annotation of text data.

SLIDE 33

Community-based annotation of image data.

SLIDE 34

Community-based transcription.

SLIDE 35

Community-based editing.

General principles:

▶ Accessibility.
▶ Transparency.
▶ Version control.
▶ Integrity.

SLIDE 36

Community-based editing: Accessibility

SLIDE 37

Community-based editing: Transparency

SLIDE 38

Community-based editing: Version control

SLIDE 39

Community-based editing: Integrity