
International Conference on Language Teaching and Assessment, August - PowerPoint PPT Presentation

NANYANG TECHNOLOGICAL UNIVERSITY National Institute of Education THE APPLICATIONS OF CORPUS LINGUISTICS IN LANGUAGE TEACHING AND ASSESSMENT International Conference on Language Teaching and Assessment, August 21 st - 23rd, 2017, UIN Syarif


  1. NANYANG TECHNOLOGICAL UNIVERSITY, National Institute of Education. THE APPLICATIONS OF CORPUS LINGUISTICS IN LANGUAGE TEACHING AND ASSESSMENT. International Conference on Language Teaching and Assessment, August 21st-23rd, 2017, UIN Syarif Hidayatullah Jakarta. Clarence Green, Nanyang Technological University, NIE3-03-118, 1 Nanyang Walk, Singapore 637616, clarence.green@nie.edu.sg

  2. Outline: - A (very) brief history of Corpus Linguistics in Language Assessment - The Turn to Academic and Disciplinary Literacy - Frequency and Language Acquisition - Wordlists, word families and lemmas - Collocations and Phrase lists - Useful assessment tools for teachers and researchers - Data-driven Learning: Assessing what a learner ‘can do’.

  3. The Applications of Corpus Linguistics in Language Teaching and Assessment - Corpus Linguistics (CL): the study of language via large, computer-readable collections of authentic text or speech. - There are few areas of contemporary language teaching and assessment where the field has not had an impact (Biber & Conrad, 2010; Römer, 2010). - This talk considers how teachers and language learners can draw on CL for data-driven learning, effective curriculum design and proficiency assessment.

  4. A (very) brief history of Corpus Linguistics and Language Assessment - The benefits of ELT informed by corpus linguistics are gaining ever more recognition (Römer, 2011). - Yet the insight is quite old: CL originates in the context of language education. THORNDIKE (1921) - A psychologist: frequency as behavioral learning. - 1921-1944, frequency-based ‘wordbooks’: “enable a teacher to know not only the general importance of each word so far as frequency of occurrence measures that, but also its importance in current popular reading” (Thorndike and Lorge 1944: 1). - He wanted high school graduation to be conditional on mastering the 15,000 most frequent English words in his book. - Classroom learning could then be sequenced from more frequent words to less frequent words.

  5. The ELT/ESL Turn: The General Service List - The General Service List/GSL (West 1953) shifted the field further towards second language learning. - West (1953) explicitly, though not exclusively, had adult second language learners in mind. - For nearly 50 years the GSL was the core list in Applied Linguistics. - GSL: claimed to be the 2000 most frequent headwords in English. - We might call such material corpus-based rather than corpus-driven.

  6. The ELT/ESL Turn - Over the past 30 years, increasing alignment between corpus-informed material and the adult second language learner in the university setting. - Partly a result of the shift to English as a global lingua franca and the globalization of the university. - West’s (1953) GSL has had no fewer than two recent updates: Brezina & Gablasova (2013) and Browne, Culligan & Phillips (2013), a “guideline for L2 vocabulary learning”. - Both are publicly available and free for teaching and research.

  7. The Academic Turn: The Academic Wordlist - Naturally, along with the university context came interest in academic language. - E.g. Coxhead’s (2000) seminal, corpus-derived Academic Wordlist (AWL). - Coxhead’s (2000: 229) methodology was far more rigorous than that of the historical lists. For example, the AWL used word families (Bauer & Nation, 1993), set minimum frequency and range criteria, excluded GSL words, and was derived from a corpus of university reading material. - Pedagogical impact has been immense: 570 word families; ordered by frequency bands; covers around 10% of the running words in academic texts.

  8. The Academic Turn: The Academic Vocabulary List - Gardner and Davies (2014: 3) critique the AWL: word families (WF) do not capture part of speech, e.g. react (a verb meaning respond) is in the same WF as reactor (a noun, most often a source of nuclear power). - Other problems: the AWL relied on the GSL, which was more than 50 years old. - Gardner and Davies (2014) thus developed the Academic Vocabulary List (AVL): a lemma-based list ranked by frequency and derived from the academic subsection of the Corpus of Contemporary American English (Davies 2010): 1. study.n 2. group.n 3. system.n 4. social.j 5. provide.v 6. however.r 7. research.n 8. level.n 9. result.n 10. include.v

  9. Recent Challenges: Discipline-specific language - Assess and economy sit in the same AWL frequency band, yet economy is likely more frequent in economics, and assess in education. - Hyland (2008), 3.4 million words of journals and dissertations in engineering, biology, applied linguistics and business: “over half the items in each list do not occur at all in any other discipline”. - Hence a recent wave of discipline-specific wordlists: business (Konstantakis 2007), medicine (Lei and Liu 2016), engineering (Todd 2017), nursing (Yang 2015), sciences (Coxhead & Hirsh, 2007), agriculture (Martinez et al. 2009).

  10. Frequency, Psycholinguistics and Assessment - How many words does a learner need to know? - Corpus linguistics has shown that the 2000 most frequent words in a wordlist of English based on a large corpus account for about 83% of the running words in texts. - Approx. 95% if a student knows the top 10,000 word families (WF).
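
The coverage figures above can be reproduced in miniature. The following is a minimal sketch, assuming a crude regex tokenizer and counting word forms rather than word families; the function names are illustrative and not taken from any published tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens (crude on purpose)."""
    return re.findall(r"[a-z']+", text.lower())


def top_n_words(tokens, n):
    """Return the set of the n most frequent word forms in tokens."""
    return {word for word, _ in Counter(tokens).most_common(n)}


def coverage(tokens, known_words):
    """Share of running words (tokens) that belong to known_words."""
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if tok in known_words) / len(tokens)


# Toy example: how much of a text do its three most frequent words cover?
sample = tokenize("The cat sat on the mat and the dog sat on the rug")
print(round(coverage(sample, top_n_words(sample, 3)), 3))
```

With a real reference corpus, `known_words` would be a published frequency band (for example the first 2000 entries of a wordlist) rather than words drawn from the text itself.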

  11. Frequency, Psycholinguistics and Assessment Why all this emphasis on frequency? George Zipf (1902 – 1950) & Zipf’s law:
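
In its usual textbook form, Zipf's law says that a word's frequency is roughly inversely proportional to its frequency rank. The symbols below ($f$, $r$, $C$, $\alpha$) are the conventional ones and are not taken from the slide.

```latex
f(r) \approx \frac{C}{r^{\alpha}}, \qquad \alpha \approx 1
```

Here $f(r)$ is the frequency of the word at rank $r$ and $C$ is a corpus-dependent constant. With $\alpha \approx 1$, the second most frequent word occurs about half as often as the most frequent one, and the tenth about one tenth as often.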

  12. Frequency, Psycholinguistics and Assessment - Zipf’s law tells us much about the psychology of human communication: we aim to maximize efficiency and economy in language. - Function words tied to grammatical relations are the core building blocks of language. - Closed-class items are repeated far more often than lexical items, many of which are used only once. - We can also surmise that frequency effects give rise to function words, closed classes and polysemy. - Hapax legomena and assessment: the long tail of a distribution in which most lexical words occur only once. - If I collected some essays, computed a wordlist ordered by frequency for each, and found that one student’s work had a longer hapax tail in its Zipfian distribution, what might this suggest? A metric of proficiency.
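
One rough way to measure the length of the hapax tail is the share of distinct words that occur exactly once. This is a minimal sketch under stated assumptions: the tokenizer is a simple regex, the function names are illustrative, and the two "essays" are invented toy sentences.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase alphabetic word forms in text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def hapax_ratio(text):
    """Proportion of distinct word types that occur exactly once."""
    freqs = word_frequencies(text)
    if not freqs:
        return 0.0
    hapaxes = [word for word, count in freqs.items() if count == 1]
    return len(hapaxes) / len(freqs)


# Invented toy essays of equal length (nine tokens each).
essay_a = "The dog saw the dog and the dog ran"
essay_b = "The hound glimpsed a spaniel and the terrier fled"
print(hapax_ratio(essay_a), hapax_ratio(essay_b))
```

The ratio depends heavily on text length, so it is only meaningful when the essays being compared are of similar length or are truncated to the same number of tokens.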

  13. Frequency, Psycholinguistics and Assessment - Word Frequency Effect in psycholinguistics: words higher on the Zipf curve are easier to process. - Lexical decision experiments tell us that frequent words/phrases derived from corpora elicit faster reaction times. - Sensitivity to the frequency of academic words thus becomes a psychometric tool for assessing linguistic proficiency (Akbari, 2014).

  14. Put into words: 1. Humanities students were generally faster at processing language than science students. 2. 3rd years/graduates were generally faster than 1st years at processing academic language. 3. 3rd year students were faster on high-frequency words in their discipline. Implications: frequent experience with the language of a discipline changes the resting activation of academic words in your brain, making language processing easier. Artificially inflating the frequency of words and phrases associated with a discipline in reading material could accelerate the lowering of the processing cost of disciplinary language, and these words and phrases can be derived from subject-specific corpora.

  15. Implications for lang. teaching & assessment: Interim Summary - As a general rule of language cognition, children/adults acquire high-frequency words (and grammatical patterns) first and then progress through the frequency bands. - Selecting vocabulary for language learning by pedagogical intuition alone is therefore highly inefficient. - Frequency of exposure reinforces cognition, makes processing easier and promotes proficiency. - If students learn vocabulary by following a band progression up to the most frequent 10,000 word families in English, they will know ≈ 95-98% of the words in any given text (Nation, 2006). - This is a natural scaffolding (i+1 in Krashen’s sense). If a student has acquired the first 1000, it is safe to assume they are developmentally ready for the next 1000.

  16. Assessment Applications: Tools - Linguists have realized that proficiency can be tested by trying to determine which frequency band level a learner is currently at: 1. Vocabulary Levels Test: Test your learners on words from within the same frequency band, e.g. the first 1000, second 1000 etc., up to 20,000 word families. http://www.lextutor.ca/tests/levels/productive/2ka.html 2. Vocabulary Size Test: 140 questions, 10 at each of 14 thousand-levels. Should give you a good idea of the number of English words students know. http://www.lextutor.ca/tests/levels/recognition/1_14k/ Nation’s website: http://www.victoria.ac.nz/lals/about/staff/paul-nation#vocab-tests

  17. Assessment: Writing - Proficient writers should use more words from the lower-frequency bands of a corpus than less proficient writers. - Assessing this can be done via automatic corpus analysis: COCA: http://www.wordandphrase.info/analyzeText.asp
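
The kind of band profile such a tool reports can be sketched in a few lines. This is an illustrative approximation only, not how the COCA text analyzer works internally: the band lists below are tiny hypothetical stand-ins for real frequency bands, and `frequency_profile` is an invented name.

```python
import re
from collections import Counter


def frequency_profile(text, bands):
    """Return the share of tokens falling in each frequency band.

    bands is an ordered list of (label, set_of_words), most frequent band first.
    Tokens found in no band are counted as "off-list".
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    labels = [label for label, _ in bands] + ["off-list"]
    counts = Counter()
    for tok in tokens:
        for label, words in bands:
            if tok in words:
                counts[label] += 1
                break
        else:
            counts["off-list"] += 1
    if not tokens:
        return {label: 0.0 for label in labels}
    return {label: counts[label] / len(tokens) for label in labels}


# Hypothetical toy bands; real bands would come from a corpus-derived wordlist.
toy_bands = [
    ("1k", {"the", "results", "show", "a"}),
    ("2k", {"significant"}),
]
print(frequency_profile("The results show a significant correlation", toy_bands))
```

A larger share of tokens outside the first band would then be read as a rough signal of a wider productive vocabulary, subject to the usual caveats about topic and text length.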

  18. What else can a corpus offer? Phraseology and Collocations - If I am strong I am powerful. But why do I drink strong tea, and never powerful tea? - Collocation: given W1, the probability of W2 is higher than chance. - “The principle of idiom is that a language user has available to him [sic] a large number of semi-preconstructed phrases that constitute single choices, even though they might appear to be analyzable into segments” (Sinclair 1991: 110). - Co-occurrence is meaningful (Conrad & Biber 2004).
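
The idea that one word raises the likelihood of another can be quantified with an association measure. The sketch below uses pointwise mutual information (PMI) over adjacent word pairs, which is one common choice and not necessarily the measure behind the slide; the function name and the toy sentence are invented for illustration.

```python
import math
from collections import Counter


def bigram_pmi(tokens, w1, w2):
    """Pointwise mutual information (log base 2) of w2 directly following w1.

    Returns -inf when the pair never occurs in tokens.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    pair_count = bigrams[(w1, w2)]
    if pair_count == 0:
        return float("-inf")
    p_pair = pair_count / (len(tokens) - 1)   # probability of the bigram
    p_w1 = unigrams[w1] / len(tokens)         # probability of each word alone
    p_w2 = unigrams[w2] / len(tokens)
    return math.log2(p_pair / (p_w1 * p_w2))


# Invented toy data: "strong tea" occurs, "powerful tea" does not.
toy = "strong tea and strong coffee but powerful engine and powerful car".split()
print(bigram_pmi(toy, "strong", "tea"), bigram_pmi(toy, "powerful", "tea"))
```

A positive score means the pair co-occurs more often than the individual word frequencies would predict. On very small samples PMI is unreliable and overrates rare pairs, so real studies combine it with a minimum frequency threshold.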

  19. What do you notice about the collocates of student and book in COCA? - Student: the words that you subjectively associate with it can be objectively extracted from corpora. - Book = something that is read, written by an author, published etc. - Words can often be defined by ‘the company they keep’ (Firth, 1968). - Vocabulary input in SLA can/should be contextualized by easily obtainable corpus information.
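
Collocate lists like those for student and book come from counting the words that appear near a node word. The following is a minimal sketch of that counting step, assuming a pre-tokenized text and a symmetric window; `collocates` is an illustrative name and the sentence is invented, not drawn from COCA.

```python
from collections import Counter


def collocates(tokens, node, window=4, top=5):
    """Count words occurring within `window` tokens either side of `node`.

    Returns the `top` most frequent neighbours as (word, count) pairs.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts.most_common(top)


# Invented toy sentence.
toy = "she read the book and the author published the book".split()
print(collocates(toy, "book", window=2, top=3))
```

Raw counts are dominated by function words such as the, which is why corpus interfaces usually rank collocates by an association score or filter by part of speech rather than by frequency alone.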
