
Frequency lists of collocations from the Gigafida 2.1 corpus

  • Additional Information
    • Publication Information:
      Centre for Language Resources and Technologies, University of Ljubljana
    • Subject:
      2021
    • Collection:
      OLAC: Open Language Archives Community
    • Abstract:
      Frequency lists of collocations were extracted from the Gigafida 2.1 Corpus of Written Standard Slovene (https://www.clarin.si/noske/run.cgi/corp_info?corpname=gfida21) using specialised scripts for extracting data from syntactically parsed corpora. The lists contain collocations with an absolute frequency of 10 or higher, split into files corresponding to 81 predefined syntactic structures. The dataset includes a formal description of these syntactic structures, with information on the restrictions and representations applied to the POS and dependency parsing annotations. The lists are sorted by the absolute frequency of the collocations and include frequency information on the individual lemmas, together with the most frequent representative forms of the combined lemmas. The lists also include the logDice score calculated for each collocation, and the number of distinct forms of the lemmas appearing in the corpus hits for that collocation.
    • Relation:
      http://hdl.handle.net/11356/1415
    • Rights:
      Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) ; https://creativecommons.org/licenses/by-sa/4.0/
    • Accession Number:
      edsbas.DF302ED9
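
The abstract mentions a frequency threshold, sorting by absolute frequency, and a logDice score. As a rough illustration only, the sketch below applies the commonly cited logDice definition, 14 + log2(2 * f_xy / (f_x + f_y)), to a few made-up rows. Whether the dataset's extraction scripts use exactly this variant is an assumption, and the row layout, the lemma placeholders, and the names `log_dice`, `rows`, and `MIN_FREQ` are hypothetical rather than taken from the dataset.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """Commonly cited logDice: 14 + log2(2 * f_xy / (f_x + f_y)).

    f_xy is the collocation frequency; f_x and f_y are the frequencies
    of the two lemmas. Assumed here, not confirmed by the dataset.
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


# Hypothetical rows, not real data from the lists:
# (lemma_1, lemma_2, collocation frequency, lemma_1 frequency, lemma_2 frequency)
rows = [
    ("a", "b", 10, 100, 100),
    ("c", "d", 50, 50, 50),
    ("e", "f", 5, 40, 60),
]

# Threshold described in the abstract: absolute frequency 10 and above.
MIN_FREQ = 10

# Keep rows at or above the threshold, then sort by collocation frequency,
# highest first, as the abstract describes for the published lists.
kept = [row for row in rows if row[2] >= MIN_FREQ]
kept.sort(key=lambda row: row[2], reverse=True)

# Attach a logDice score, rounded to three decimals, to each kept row.
scored = [
    (l1, l2, f_xy, round(log_dice(f_xy, f_x, f_y), 3))
    for l1, l2, f_xy, f_x, f_y in kept
]

for entry in scored:
    print(entry)
```

With this definition the score reaches its maximum of 14 when the two lemmas occur only together, which is why the second hypothetical row scores highest.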