Course: Creating your own corpus-driven CALL materials from A-Z

Topic outline

General
- Announcements Forum
- Workshop overview
  We will begin by reporting on two of our previous projects, the Simple English Wikipedia Corpus (2017) and the American Educational Research Association (AERA) corpus (2019), as examples of corpora built from freely available text databases. Both projects collect examples of authentic language that can serve research and pedagogical purposes. We briefly introduce the two corpora below.
  The Simple English Wikipedia Corpus was created in 2016 from the user-contributed online encyclopedia Simple English Wikipedia (SEW). The SEW is written in simplified language and is intended to be an accessible reference for learners of English. We analyzed the vocabulary demands of the SEW against vocabulary lists, using AntConc and Lextutor. We found that the vocabulary requirements of the SEW are similar to those of the standard English Wikipedia (Hendry & Sheepy, 2017).
  The American Educational Research Association (AERA) corpus was created in 2018 from the AERA open access repository, which collects conference papers submitted to the AERA annual conference. We used word lists such as the New Academic Word List (Browne, Culligan, & Phillips, 2013) to assess the vocabulary required to read submissions in each division of the AERA annual conference.
  For our workshop, we will first invite discussion of sources of authentic texts that participants could collect to build their own corpora. We will then demonstrate how to use the tools available on Lextutor to clean and compile a small corpus.
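  As a rough illustration of what "cleaning and compiling" involves, the sketch below strips markup from a few collected documents and joins them into one plain-text corpus. It is not how Lextutor works internally; the function names `clean_text` and `compile_corpus` are hypothetical, and the regular-expression tag stripping is a deliberately crude assumption that suits simple pages only.

```python
import html
import re


def clean_text(raw):
    """Return plain text with tags, entities, and footnote markers removed."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)        # crude tag removal (assumption)
    unescaped = html.unescape(no_tags)            # e.g. "&amp;" becomes "&"
    no_refs = re.sub(r"\[\d+\]", "", unescaped)   # drop markers such as "[1]"
    return re.sub(r"\s+", " ", no_refs).strip()   # collapse runs of whitespace


def compile_corpus(documents):
    """Clean each document and join the non-empty ones, one per line."""
    cleaned = [clean_text(doc) for doc in documents]
    return "\n".join(text for text in cleaned if text)


pages = ["<p>Cats &amp; dogs[1]  are   pets.</p>", "   ", "<b>Birds</b> can fly."]
print(compile_corpus(pages))
```

  Keeping one document per line makes it easy to save the result as a single .txt file and load it into a corpus tool.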
  Using AntConc, we will apply simple analytical techniques to the Simple English Wikipedia Corpus to:
  - generate frequency lists,
  - determine the most frequent vocabulary items in a given corpus, and
  - use stop lists from the AntConc website to remove function words, comparatively common items, and academic vocabulary in the form of the AWL.
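  The idea behind these three steps can be sketched in a few lines of Python. This is only an illustration of the technique, not AntConc's implementation; the tiny `STOP_WORDS` set and the sample sentence are made up for the example, and a real stop list would be loaded from a published file.

```python
import re
from collections import Counter

# Hypothetical stop list for illustration only; a real one would be much longer.
STOP_WORDS = {"the", "a", "of", "is", "and", "in", "to"}


def tokenize(text):
    """Lowercase the text and return its word tokens (apostrophes allowed)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_list(text, stop_words=frozenset()):
    """Return (word, count) pairs, most frequent first, skipping stop words."""
    tokens = [t for t in tokenize(text) if t not in stop_words]
    return Counter(tokens).most_common()


sample = "The cat sat on the mat. The mat is red and the cat is black."
for word, count in frequency_list(sample, STOP_WORDS)[:3]:
    print(word, count)
```

  Running the same function with and without the stop list shows how much of a frequency list is taken up by function words.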
  Next, we will produce a vocabulary profile of the corpus using tools available on Lextutor, estimate its readability, and then compare two different texts from within the corpus to determine which vocabulary items they have in common.
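  A vocabulary profile reports what share of a text's running words falls into each word list, and a text comparison finds the words two texts share. The sketch below shows both ideas under simplifying assumptions: the `BANDS` lists are hypothetical miniatures, and matching is done on exact word forms rather than on word families or lemmas, so the figures will differ from those a full profiler reports.

```python
import re

# Hypothetical miniature word lists; real profiling would load full lists
# such as the NGSL or NAWL from files.
BANDS = {
    "high-frequency": {"the", "is", "a", "of", "in", "water", "and"},
    "academic": {"analyze", "data", "method"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vocabulary_profile(text, bands):
    """Return the percentage of tokens in each band, plus an off-list share."""
    tokens = tokenize(text)
    counts = {name: 0 for name in bands}
    counts["off-list"] = 0
    for token in tokens:
        for name, words in bands.items():
            if token in words:
                counts[name] += 1
                break                      # count each token in one band only
        else:
            counts["off-list"] += 1        # token matched no band
    total = len(tokens)
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100 * n / total for name, n in counts.items()}


def shared_vocabulary(text_a, text_b):
    """Return the set of word forms that occur in both texts."""
    return set(tokenize(text_a)) & set(tokenize(text_b))


print(vocabulary_profile("The data is in the water", BANDS))
print(shared_vocabulary("The cat sat down", "A cat ran off"))
```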
  Last, we will explore some of the tools available in #LancsBox, such as the keyword, collocation, and colligation identification tools, to show how one can explore beyond vocabulary demands.
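  To give a sense of what a collocation tool does at its simplest, the sketch below counts the words that appear within a fixed window around a chosen node word. This is plain co-occurrence counting under assumed settings (the `collocates` function, the window size, and the sample sentence are all hypothetical); dedicated tools usually go further and rank collocates with statistical association measures.

```python
import re
from collections import Counter


def collocates(text, node, window=2):
    """Count words occurring within `window` tokens of each `node` occurrence."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - window):i]       # up to `window` words before
            right = tokens[i + 1:i + 1 + window]      # up to `window` words after
            counts.update(left + right)
    return counts


sample = "Strong tea and strong coffee, but weak tea."
print(collocates(sample, "tea", window=2).most_common())
```

  Changing the window size is a quick way to show learners the difference between immediate neighbours and looser associations.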
  Participants will be invited to explore both the Simple English Wikipedia Corpus and their own creations. Each section will also include relevant research and examples of how to use these techniques in the classroom. The end of the workshop will be open for participants to share their own experiences and ideas on how to make better use of corpora for research and pedagogical purposes.
Slides and materials
- Slides and workshop materials (Google Drive link)
  Follow this link to access the materials you will need for the workshop.
  The texts are provided for educational purposes only.
Corpus Tools
- Tom Cobb's Compleat Lexical Tutor v.8.3
  A complete website for learning about English and French words.
- Laurence Anthony's AntConc
  A freeware corpus analysis toolkit for concordancing and text analysis.
- #LancsBox: Lancaster University corpus toolbox
  #LancsBox is a new-generation software package for the analysis of language data and corpora developed at Lancaster University.
Useful word lists for EFL and ESL learners
- New General Service List
  The New General Service List (NGSL) is an updated and expanded version of the GSL by Michael West.
- New Academic Word List 1.0
  New Academic Word List by Browne, C., Culligan, B., and Phillips, J.
- New General Service List Test, New Academic Word List Test
  Written tests of receptive knowledge of the word lists above.
Further reading
Give us your feedback!
- Please give us your feedback on the EUROCALL 2019 corpus workshop!
  We want to make future versions of this workshop as helpful as possible!
  Let us know what we can do to make it great.
