NLP De Luxe - Challenges for Natural Language Processing in Luxembourg

  • Additional Information
    • Contributors:
      KLEIN, Jacques; Interdisciplinary Centre for Security, Reliability and Trust (SnT) > TruX - Trustworthy Software Engineering
    • Publication Information:
      Unilu - University of Luxembourg
    • Publication Year:
      2023
    • Collection:
      University of Luxembourg: ORBilu - Open Repository and Bibliography
    • Abstract:
      The Grand Duchy of Luxembourg is a small country in Western Europe, which, despite its size, is an important global financial centre. Due to its highly multilingual population, and the fact that one of its national languages - Luxembourgish - is regarded as a low-resource language, this country lends itself naturally to a wide variety of interesting research opportunities in the domain of Natural Language Processing (NLP). This thesis discusses and addresses challenges with regard to domain-specific and language-specific NLP, using the unique linguistic situation in Luxembourg as an elaborate case study. We focus on three main topics: (I) NLP challenges present in the financial domain, specifically handling personal names in sensitive documents, (II) NLP challenges related to multilingualism, and (III) NLP challenges for low-resource languages with Luxembourgish as the language of interest. With regard to NLP challenges in the financial domain, we address the challenge of finding and anonymising names in documents. Firstly, an empirical study on the usefulness of Transformer-based deep learning models is presented on the task of Fine-Grained Named Entity Recognition. This empirical study was conducted for a wide array of domains, including the financial domain. We show that Transformer-based models, and in particular BERT models, yield the best performance for this task. We furthermore show that the performance is also strongly dependent on the domain itself, regardless of the choice of model. The automatic detection of names in text documents in turn facilitates the anonymisation of these documents. However, anonymisation can distort data and have a negative effect on models that are built on that data. We investigate the impact of anonymisation of personal names on the performance of deep learning models trained on a large number of NLP tasks. Based on our experiments, we establish which anonymisation strategy should be used to guarantee accurate NLP models. Regarding NLP challenges related to multilingualism, ...
    • File Description:
      xvi, 132
    • Relation:
      https://orbilu.uni.lu/handle/10993/54910; info:hdl:10993/54910; https://orbilu.uni.lu/bitstream/10993/54910/1/PhD_Thesis_Lothritz.pdf
    • Online Access:
      https://orbilu.uni.lu/handle/10993/54910
      https://orbilu.uni.lu/bitstream/10993/54910/1/PhD_Thesis_Lothritz.pdf
    • Rights:
      open access ; http://purl.org/coar/access_right/c_abf2 ; info:eu-repo/semantics/openAccess
    • Accession Number:
      edsbas.B74BC