digital methods in language documentation
Digital Methods in Language Documentation - Andrea Berez-Kroeker - PowerPoint PPT Presentation

2019 Linguistic Institute Course 353: Digital Methods in Language Documentation. Andrea Berez-Kroeker, University of Hawaiʻi at Mānoa. Colleen Fitzgerald, The University of Texas at Arlington. Welcome! Andrea Berez-Kroeker, Colleen Fitzgerald


  1. 2019 Linguistic Institute Course 353: Digital Methods in Language Documentation Andrea Berez-Kroeker, University of Hawaiʻi at Mānoa Colleen Fitzgerald, The University of Texas at Arlington

  2. Welcome! Andrea Berez-Kroeker (andrea.berez@hawaii.edu), Colleen Fitzgerald (cmfitz@uta.edu)

  3. Course management: • ORBUND: Link to syllabus, schedule, grades, announcements • Google Drive folder (bit.ly/DigLangDocLSA19): All slides and readings

  4. Get to know each other: Speed data-ing! Turn to the person next to you… Find out about that person and their work Be ready to introduce your partner to the class!

  5. Day 1: What is language documentation? Why digital methods? Basics of data management for LangDoc

  6. What is language documentation?

  7. We can’t cover everything here! • Grammar (Phonetics, Phonology, Morphology, Syntax) • Culture / Anthropology • Equipment • Data management • Ethics • Language pedagogy • Data preservation • Field methods • Creating learning materials • Grant writing • Ethnoscience

  8. For more information... http://hs.umt.edu/colang/default.php

  9. A definition of LangDoc • Language Documentation is the endeavor to create: • a long lasting • multipurpose record of • a language in use • in many genres.

  10. A language documentation is a long-lasting, multipurpose record of a language in use in many genres.

  11. The documentation: • Images: Pictures of people, notebooks, or specimens (.jpg, .tiff, .png) • Transcripts of recordings: Text files (.pdf, .txt, ELAN files, .doc, Google Docs) • Dictionary/lexicon: Database or spreadsheet files (.xls, .ods, Google Sheet, FLEx, Toolbox, Filemaker) • Digital recordings: Audio & video files (wav, mp3, mp4). The purposes documentation serves: • Websites (online dictionaries, etc.) • Movies/videos, including with subtitles • Mobile apps • Books (Grammars, dictionaries, readers) • Portable sound collections (music, stories on DVD or mp3) • Scholarly pubs (articles, theses, dissertations)

  12. Readings folder: What is LangDoc? Berge, Anna. 2010. Adequacy in documentation. In Grenoble, Lenore A. & N. Louanna Furbee (eds.), Language documentation: Practice and values, 51-65. Amsterdam: John Benjamins. Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. Linguistics 36: 161-195. Himmelmann, Nikolaus P. 2006. Language documentation: What is it good for? In Jost Gippert, Nikolaus P. Himmelmann, & Ulrike Mosel (eds.), Essentials of language documentation, 1-30. Berlin: Mouton de Gruyter. Hinton, Leanne. 2001. Language revitalization: An overview. In Hinton, Leanne & Ken Hale (eds.), The green book of language revitalization in practice, 3-17. Leiden: Brill. Holton, Gary. 2014. Mediating language documentation. In Nathan, David & Peter K. Austin (eds.), Language documentation and description, vol 12: Special issue on language documentation and archiving, 37-52. London: SOAS.

  13. Lüpke, Frederike. 2010. Research methods in language documentation. In Austin, Peter K. (ed.), Language documentation and description, vol 7, 55-104. London: SOAS. McDonnell, Bradley, Andrea L. Berez-Kroeker & Gary Holton (eds.). 2018. Reflections on language documentation 20 years after Himmelmann 1998, Language Documentation & Conservation Special Publication 15. Honolulu: University of Hawaiʻi Press. Woodbury, Anthony C. 2003. Defining documentary linguistics. In Austin, Peter K. (ed.), Language documentation and description, vol 1, 35-51. London: SOAS. Woodbury, Anthony C. 2011. Language documentation. In Austin, Peter K. & Julia Sallabank (eds.), The Cambridge handbook of endangered languages, 159-186. Cambridge: Cambridge University Press.

  14. Why a class on digital methods in language documentation?

  15. Because digital data has a few problems with longevity.

  16. Digital data problems with longevity: Three central problems need to be solved: • The media problem • The format problem • The storage and access problem

  17. The media problem: The more advanced our technology becomes, the more ephemeral it is: ● Hard drives: 5 years < ● CDs/DVDs: 10 years < ● Cassette tapes: 30 years < ● Paper: 100-200 years (+) < ● Stone tablets: ∞

  18. The media problem Not only do media degrade… ...devices for reading them become obsolete!

  19. ...requiring data rescuers and archivists to use machines like “Frank”

  20. The format (or encoding) problem: Proprietary formats are controlled by intellectual property law and are subject to the whims of the developers, who may: ● Cease development or support ● Charge fees to access data. Example: HyperCard dictionaries (e.g. Gwich’in) ● Data now ostensibly lost

  21. The storage & access problem: Data cannot be effectively stored for longevity by individuals, who ● Lack expertise in data migration to new formats ● Inevitably lose interest, retire, or die. Only an archive with an institutional commitment to migrating and backing up data is an effective locus of long-term storage.

  22. The storage & access problem Data must be discoverable and (correctly, ethically) accessible. Without proper metadata, we don’t know anything about the data… ...or even that it exists! Data that isn’t accessible by anyone is useless.

  23. Basics of data management: • File naming & organization • Storage and backing up • Metadata • Archiving

  24. LangDoc digital workflow overview

  25. File naming & organization

  26. Think about file naming & organization early: Naming and organizing your files is a DAY ONE activity. If you start with a poor system, you will have trouble • Locating files • Distinguishing between files. You need to plan carefully.

  27. Think about file naming & organization early: Being organized includes • Documenting your plans for naming and organization • Keeping a metadata catalog that tracks key information (more on this later today) • Sticking to your plan

  28. Tips for naming and organizing files https://youtu.be/c-Mcp5ozgx0

  29. Tips for naming and organizing files: In file naming: ● Use a unique ID that is not dependent on file structure ● File names can be semantic or non-semantic ● No spaces or funny characters ● Select a convention and stick with it. ● Check with your archive! Folder organization is for your convenience, but should not be used for file identification.
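The naming tips above can be checked mechanically. The sketch below is only an illustration: the pattern (language code, speaker initials, date as YYYYMMDD, short description, extension) is an assumed convention, and the function names are hypothetical. Your archive may require something different.

```python
import re

# Assumed convention (hypothetical): <lang>-<INITIALS>-<YYYYMMDD>-<description>.<ext>
# The pattern allows no spaces, underscores, or other special characters.
NAME_PATTERN = re.compile(r"[a-z]{3}-[A-Z]{2,3}-\d{8}-[a-z0-9]+\.[a-z0-9]+")


def follows_convention(file_name: str) -> bool:
    """Return True if the bare file name matches the assumed convention."""
    return NAME_PATTERN.fullmatch(file_name) is not None


def find_duplicate_names(paths: list[str]) -> set[str]:
    """Return file names that appear under more than one folder.

    A name that is only unique inside its folder depends on the folder
    structure for identification, which the tips above warn against.
    """
    seen: set[str] = set()
    duplicates: set[str] = set()
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # keep only the part after the last slash
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    return duplicates
```

Running both checks over a full file listing before deposit is one way to confirm that a convention was actually followed.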

  30. Is this a good folder structure and file naming strategy? “Ahtna” / “Tazlina_Village” / “Louisa_Jones” / “fish_story.wav” No. ● Dependent on folder structure for identification. ● Could be many fish stories, so not unique. ● Files can get moved.

  31. How about this one? “Ahtna” / “Tazlina_Village” / “Louisa_Jones” / “aht-LJ-20170729-fishstory.wav” Yes! ● File name is unique ● “Semantic” file name with a lot of identifying info for your convenience ● File structure also only for your convenience ● May be too long for some tastes (or servers)

  32. Or this one? “Ahtna” / “aht-LJ-20170729-fishstory.wav” Yes! ● Embedded file structure not necessary ● Files will order alphabetically

  33. Or even this one? “Ahtna” / “ABK-0001.wav” Yes! ● File name is still unique! ● Non-semantic ● All catalog info is kept in your metadata catalog ● FILE NAMES NEVER REPLACE YOUR METADATA!
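With non-semantic names such as “ABK-0001.wav”, everything a reader needs to know lives in the metadata catalog. A minimal sketch of such a catalog as a CSV file follows; the column names and both helper functions are assumptions made for illustration, not a format prescribed by the course or by any archive.

```python
import csv

# Hypothetical catalog columns; a real project should follow its archive's requirements.
FIELDS = ["file_id", "language", "speaker", "date", "description", "location"]


def next_file_id(existing_ids, prefix="ABK"):
    """Return the next unused sequential ID, e.g. ABK-0002 after ABK-0001."""
    numbers = [
        int(file_id.split("-")[1])
        for file_id in existing_ids
        if file_id.startswith(prefix + "-")
    ]
    return f"{prefix}-{max(numbers, default=0) + 1:04d}"


def write_catalog(rows, handle):
    """Write catalog rows (a list of dicts keyed by FIELDS) as CSV to an open file."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

A row for the recording above might look like this (the values are illustrative):

```python
row = {
    "file_id": "ABK-0001",
    "language": "Ahtna",
    "speaker": "Louisa Jones",
    "date": "2017-07-29",
    "description": "fish story",
    "location": "Ahtna/ABK-0001.wav",
}
```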

  34. Thinking about organizing your files & folders: ● Most common is to organize by session ○ Multimedia folders based on a single event ● Check with your archive ● Track location of files in your metadata catalog (Video talks about organizing for deposit, but will also work for your own purposes) https://youtu.be/ugQrzBHOUws

  35. Storage and Backup

  36. Storage and Backup: Data must be protected during collection and processing (“in the field and lab”): protected for integrity, security, and access. This is not the same as your plans to keep data safe, secure, and accessible after it leaves your lab, although they may overlap in execution.

  37. Storage and Backup for data integrity: Your data is vulnerable during collection and processing! • Electronic/digital dangers: Broken drives, power surges, viruses • Environmental dangers: Water damage, fire damage, insects, mold • Human dangers: Theft, loss, overwriting, dropping/crushing

  38. A good rule to remember: LOCKSS: Lots Of Copies Keep Stuff Safe! https://www.lockss.org/

  39. Storage and Backup for data integrity: LOCKSS: How will you redundantly back up your data? In the field: Multiple hard drives? Flash cards? Where will they be stored?

  40. One in your cabin... ...one in your car... ...and one at the cultural center.
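Lots of copies only keep stuff safe if the copies are still identical to the original. The slides do not name a method for checking this; comparing checksums is one common approach, sketched here with Python's standard `hashlib`. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copies):
    """Map each copy's path to True if it exists and is byte-identical to the original."""
    expected = sha256_of(original)
    return {
        str(copy): Path(copy).exists() and sha256_of(copy) == expected
        for copy in copies
    }
```

Any copy reported as `False` is missing or has changed and should be replaced from a copy that still matches.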

  41. Another approach to LOCKSS: The 3-2-1 Principle: (At least) 3 copies, on (at least) 2 types of storage media*, with (at least) 1 off-site. *Different brands of hard drive, or a hard drive and flash storage, or a hard drive and DVDs, or….
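The 3-2-1 Principle is simple enough to state as a check over a backup plan. This is a sketch only: the `BackupCopy` record and its fields are hypothetical, and deciding what counts as a distinct media type or as off-site is left to you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical record for illustration)."""
    label: str        # where the copy lives, e.g. "cabin"
    media_type: str   # e.g. "hard drive", "flash storage", "DVD"
    off_site: bool    # True if stored away from the main work location


def meets_3_2_1(copies):
    """Return True if there are >= 3 copies, on >= 2 media types, with >= 1 off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.media_type for copy in copies}) >= 2
    has_off_site = any(copy.off_site for copy in copies)
    return enough_copies and enough_media and has_off_site
```

For example, a plan with one hard drive in your cabin, a flash card in your car, and a second hard drive at the cultural center passes, provided you treat the cultural center as off-site.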

  42. What about cloud storage? Fine for convenience and sharing with collaborators. Not to be considered primary backup, ever. Considerations: data ownership, cost, security, the provider going out of business (e.g. Wuala). Higher security: SpiderOak, Tresorit. Easier, more common: Google Drive, Dropbox, iCloud

  43. Metadata
