Spelling errors and keywords in born-digital data: a case study using the Teenage Health Freak Corpus

  • Authors: Smith, Catherine; Adolphs, Svenja; Harvey, Kevin; Mullany, Louise
  • Record Type:
    Electronic Resource
  • Electronic Access:
    http://eprints.nottingham.ac.uk/35782/
    http://eprints.nottingham.ac.uk/35782
    http://dx.doi.org/10.3366/cor.2014.0055
    doi:10.3366/cor.2014.0055
  • Additional Information
    • Publisher Information:
      Edinburgh University Press
    • Abstract:
      The abundance of language data that is now available in digital form, and the rise of distinct language varieties that are used for digital communication, mean that issues of non-standard spellings and spelling errors are, in future, likely to become more prominent for compilers of corpora. This paper examines the effect of spelling variation on keywords in a born-digital corpus in order to explore the extent and impact of this variation for future corpus studies. The corpus used in this study consists of e-mails about health concerns that were sent to a health website by adolescents. Keywords are generated using the original version of the corpus and a version with spelling errors corrected, and the British National Corpus (BNC) acts as the reference corpus. The ranks of the keywords are shown to be very similar and, therefore, suggest that, depending on the research goals, keywords could be generated reliably without any need for spelling correction.
    • Subject:
    • Identifier:
      10.3366.cor.2014.0055
    • Availability:
      Open access content.
    • Note:
      doi:10.3366/cor.2014.0055
    • Other Numbers:
      UVN oai:eprints.nottingham.ac.uk:35782
      Smith, Catherine, Adolphs, Svenja, Harvey, Kevin and Mullany, Louise (2014) Spelling errors and keywords in born-digital data: a case study using the Teenage Health Freak Corpus. Corpora, 9 (2). pp. 137-154. ISSN 1755-1676
      doi:10.3366/cor.2014.0055
      1312899503
    • Contributing Source:
      UNIV OF NOTTINGHAM
      From OAIster®, provided by the OCLC Cooperative.
    • Identifier:
      edsoai.on1312899503
Holdings: Online