Introduction: Defining Sentences; Written Language Bias; Defining Sentential Units; Historical Continuity
The LangBank T-Unit: Background; Annotation Principles; Basic Definition; Sentential Properties; Handling Ambiguities; Inter-Rater Reliability
Segmenting Speech: The SegCor Project; Maximal Syntactic Unit; Generalizable Solutions; Open Issues
Conclusion
References
A Sentence is a Sentence is a Sentence? Parallels and Differences between the Segmentation of Oral and Historical Language Data
Zarah Weiss
based on work done in collaboration with Gohar Schnelle, Laura Perlitz, Carolin Odebrecht, Hagen Hirschmann, Anke Lüdeling, and Detmar Meurers
“Unit segmentation in Spoken Interaction”, SegCor Workshop, Orléans, June 3-5, 2019