Jelle Huisman - SIL International Typesetting language data using ConTeXt
E16 & DEtool
EuroTeX 2009 & 3rd ConTeXt Meeting
Collecting and publishing language data
Collect and publish: how?
− general information: Ethnologue
− language-specific information: dictionary, grammar description
− linguistic analyses: phonetics, grammatical structures, sociolinguistics
Ethnologue: general
Handbook describing all the world's languages
16th edition was published in 2009
− Part 0: Intro, statistical summaries (50 p.)
− Part 1: 6909 language descriptions (600 p.)
− Part 2: Language maps (200 p.)
− Part 3: Indexes (400 p.)
Ethnologue: data flow
Field linguists collect data on PC-based systems
All data is stored in an Oracle database on a secure web server
Different output paths with XSLT
− web version
− TeX-based output for book publication
Download the TeX output and typeset it locally using ConTeXt (UK/US)
Ethnologue: typesetting
\startmode[proofreading] % special layout for proofreading mode
\setuppapersize[letter][letter] % paper size for proofreading mode
\setuplayout[backspace=18mm, width=160mm, topspace=7mm, top=0mm, header=16mm, footer=6mm, height=250mm]
\stopmode
Ethnologue: typesetting
In the project file:
\enablemode[book]
%\enablemode[proofreading]
Proofreading is done by editorial staff in Dallas: at least 2000 pages to proofread for all language descriptions
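The mode switch above can be sketched as one small, self-contained file. This is only an illustration: the proofreading settings are taken from the slides, while the `book` layout (A4 paper) and the test sentence in the body are assumptions added to make the example complete.

```tex
% Sketch of a project file that switches layout by mode.
% Only the proofreading settings come from the slides;
% the book settings below are placeholder assumptions.
\enablemode[book]
%\enablemode[proofreading]

\startmode[book]
  % Assumed book layout, not taken from the talk
  \setuppapersize[A4][A4]
\stopmode

\startmode[proofreading]
  % Special layout for proofreading mode (from the slides)
  \setuppapersize[letter][letter]
  \setuplayout[backspace=18mm, width=160mm, topspace=7mm, top=0mm,
               header=16mm, footer=6mm, height=250mm]
\stopmode

\starttext
% Shows which mode is active when the file is processed
\doifmodeelse{proofreading}{Proofreading copy}{Book copy}
\stoptext
```

Swapping the comment sign between the two `\enablemode` lines selects the other layout. With current ConTeXt, a mode can also be enabled from the command line, for example `context --mode=proofreading`, which avoids editing the project file.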
Statistical Summaries (multipage tables)
Language Description (6909)
Language Map (200)
Language Name Index (50,000)
Dictionary
[Sample dictionary page (figure); text not recoverable from the slide]
FieldWorks
FW Data Notebook
FW WorldPad
FW Language Explorer
− lexicon
− interlinear tool
− grammar tool
− data integration
www.sil.org/computing/fieldworks/
Dictionary Express
File > Print as Dictionary → Dictionary Express
Dictionary Express
ConTeXt-based tool for Dictionary Express (DE): DEtool
− data from database
− in proper format
− for typesetting system to produce PDF
− (behind the scenes)
Dictionary output
DEoutput
Dictionaries: typesetting
− data from database: converters to ODF and TeX format
− in proper format: using TeX-tagged Dictionary Data (T2D2)
− for typesetting system to produce PDF: ConTeXt-Based Library for Typesetting (CoBaLT)
− (hidden from the user): embedded minimal ConTeXt distribution (miniCTX)
(nota bene: work in progress)
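How these pieces might fit together in a single ConTeXt job can be sketched as follows. The file names `cobalt-style` and `dictionary-data` and the two-column layout are hypothetical; the slides do not show the actual DEtool file layout.

```tex
% Hypothetical job file; the names are illustrative only.
\environment cobalt-style   % assumed: style library defining the tag macros (CoBaLT)

\starttext
  \startcolumns[n=2]        % assumed two-column dictionary layout
    \input dictionary-data  % assumed: T2D2-tagged entries exported from the database
  \stopcolumns
\stoptext
```

Keeping the exported data in its own file means the converter only has to write tagged entries, while all layout decisions stay in the style library.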
Dictionaries: sample tags
− headword (hw): the word that this particular entry is about
− pronunciation (pr): the proper pronunciation of the word, written using the International Phonetic Alphabet (IPA)
− part of speech (ps): the grammatical function of the word
− language tag (lt): the language of the definition or example
− definition (de): meaning of the headword
− example (ex): example of the word used in a sentence
Dictionaries: sample entry
\Bentry
\Bhw{abel}\Ehw
\marking[guidewords]{abel}
\Bpr{a.bl}\Epr
\Bps{noun(al)}\Eps
\Blt{Eng}\Elt \Bde{line, row}\Ede
\Blt{Pdg}\Elt \Bde{lain}\Ede
\Eentry
[Typeset dictionary entry (figure); text not recoverable from the slide]
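One way the begin/end tags of such an entry could be mapped onto formatting is with delimited macros, one per tag pair. The definitions below are a minimal sketch written for this illustration; they are not the actual CoBaLT macros, and the font choices and the guideword header are assumptions.

```tex
% Hypothetical tag definitions; the real CoBaLT macros are not shown in the slides.
\definemarking[guidewords]
% Assumed page header: first and last headword on the page
\setupheadertexts[{\getmarking[guidewords][first]}][{\getmarking[guidewords][last]}]

\def\Bentry{\bgroup\noindent\ignorespaces} % start an entry as its own paragraph
\def\Eentry{\par\egroup}                   % end the entry
\def\Bhw#1\Ehw{{\bf #1}}                   % headword in bold
\def\Bpr#1\Epr{[#1]}                       % pronunciation in square brackets
\def\Bps#1\Eps{{\it #1}}                   % part of speech in italic
\def\Blt#1\Elt{{\tfx #1}}                  % language tag in a smaller size
\def\Bde#1\Ede{#1}                         % definition in the body font

\starttext
\Bentry
\Bhw{abel}\Ehw
\marking[guidewords]{abel}
\Bpr{a.bl}\Epr
\Bps{noun(al)}\Eps
\Blt{Eng}\Elt \Bde{line, row}\Ede
\Blt{Pdg}\Elt \Bde{lain}\Ede
\Eentry
\stoptext
```

Because each macro takes everything up to its matching end tag as its argument, the data file stays purely descriptive and the appearance of every field can be changed in one place.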
Dictionaries: final
Most of the required features are implemented
− font selection (including the use of Graphite fonts)
− basic dictionary layout and picture support
Some features are easy to implement, others are not (page-wide pictures that keep floating to the next page)
Remaining challenge: ultra-light version of ConTeXt
− dealing with the Ruby dependency
− stripping the TeX tree
Work in progress
Comments or Questions?
www.ethnologue.com www.sil.org/computing/fieldworks jelle_huisman@sil.org