Consonant-vowel structures in the GOS 1.0 corpus 1.1

  • Additional information
    • Publication information:
      Centre for Language Resources and Technologies, University of Ljubljana
      Jožef Stefan Institute
    • Subject:
      2020
    • Collection:
      OLAC: Open Language Archives Community
    • Abstract:
      The lists contain consonant-vowel structures of all lemmas, word forms, and standardized word forms in the GOS 1.0 Corpus of Spoken Slovene (http://hdl.handle.net/11356/1040). In each unit, characters were converted as follows: C - consonant (in lists with fine-grained character categorizations, consonants were divided into Z - sonorant, G - voiced obstruent, and K - voiceless obstruent), V - vowel, X - foreign consonant, Y - foreign vowel, S - symbol, P - punctuation, N - number, F - non-Latin-script character, ! - other. Each consonant-vowel structure is listed with its frequency in the corpus (i.e. the total sum of the frequencies of all units corresponding to the consonant-vowel structure), as well as the set of all units (in the lists labeled "entire") or the set of its 30 most frequent units (in the lists labeled "short"), along with their part-of-speech categories and individual frequencies. The lists also give the number of all unique units within each consonant-vowel structure. The lists were prepared based on frequency lists extracted from GOS 1.0 using LIST: http://hdl.handle.net/11356/1276. Note that there exists a related resource, "Consonant-vowel structures in the Gigafida 2.0 corpus": http://hdl.handle.net/11356/1289. Compared to the previous version (http://hdl.handle.net/11356/1290), this one fixes several typos and replaces all instances of "normalized forms" with the more adequate term "standardized forms" (as used in the SSJ project).
    • Relation:
      http://hdl.handle.net/11356/1290; http://hdl.handle.net/11356/1367
    • Rights:
      Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) ; https://creativecommons.org/licenses/by-sa/4.0/
    • Accession number:
      edsbas.7B30CFB7
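
The character conversion and frequency aggregation described in the abstract can be illustrated with a minimal Python sketch. This is not the tooling used to build the resource: the letter sets below (which letters count as sonorants, voiced or voiceless obstruents, foreign consonants, or foreign vowels), the function names `classify`, `cv_structure`, and `aggregate`, and the toy frequency list are all assumptions made for illustration. Latin letters outside the assumed sets fall through to "!", which may differ from how the actual lists treat them.

```python
import unicodedata
from collections import defaultdict

# Assumed letter sets (hypothetical; the resource's exact inventories are not given here).
VOWELS = set("aeiou")
SONORANTS = set("mnrljv")            # Z in fine-grained lists
VOICED_OBSTRUENTS = set("bdgzž")     # G in fine-grained lists
VOICELESS_OBSTRUENTS = set("ptksšcčfh")  # K in fine-grained lists
FOREIGN_CONSONANTS = set("qwxćđ")    # X (assumed membership)
FOREIGN_VOWELS = set("yäöü")         # Y (assumed membership)


def classify(ch, fine_grained=False):
    """Map one character to a category symbol from the abstract."""
    low = ch.lower()
    if low in VOWELS:
        return "V"
    if low in SONORANTS:
        return "Z" if fine_grained else "C"
    if low in VOICED_OBSTRUENTS:
        return "G" if fine_grained else "C"
    if low in VOICELESS_OBSTRUENTS:
        return "K" if fine_grained else "C"
    if low in FOREIGN_CONSONANTS:
        return "X"
    if low in FOREIGN_VOWELS:
        return "Y"
    category = unicodedata.category(ch)
    if category.startswith("N"):
        return "N"  # number
    if category.startswith("P"):
        return "P"  # punctuation
    if category.startswith("S"):
        return "S"  # symbol
    if category.startswith("L") and "LATIN" not in unicodedata.name(ch, ""):
        return "F"  # letter from a non-Latin script
    return "!"      # anything else, including unlisted Latin letters


def cv_structure(unit, fine_grained=False):
    """Convert a lemma or word form to its consonant-vowel structure."""
    return "".join(classify(ch, fine_grained) for ch in unit)


def aggregate(unit_frequencies, fine_grained=False):
    """Group units by structure, summing frequencies and collecting members."""
    result = defaultdict(lambda: {"frequency": 0, "units": {}})
    for unit, freq in unit_frequencies.items():
        entry = result[cv_structure(unit, fine_grained)]
        entry["frequency"] += freq
        entry["units"][unit] = freq
    return dict(result)


# Toy frequency list (invented values, not taken from GOS 1.0).
toy = {"hiša": 5, "miza": 3, "in": 10}
summary = aggregate(toy)
for structure, entry in summary.items():
    print(structure, entry["frequency"], len(entry["units"]), entry["units"])
```

Truncating each structure's units to the 30 most frequent, as in the lists labeled "short", would amount to sorting `entry["units"]` by frequency and keeping the first 30 items.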