
Transfer learning for abusive language detection ; Apprentissage par transfert pour la détection des abus de langage

  • Additional information
    • Contributors:
      Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH); Inria Nancy - Grand Est; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Université de Lorraine; Irina Illina; Dominique Fohr; ANR-15-IDEX-0004,LUE,Isite LUE(2015)
    • Publication information:
      CCSD
    • Subject:
      2023
    • Collection:
      Université de Lorraine: HAL
    • Abstract:
      The proliferation of social media, despite its multitude of benefits, has led to the increased spread of abusive language. Such language, being typically hurtful, toxic, or prejudiced against individuals or groups, requires timely detection and moderation by online platforms. Deep learning models for detecting abusive language have displayed great levels of in-corpus performance but underperform substantially outside the training distribution. Moreover, they require a considerable amount of expensive labeled data for training. This strongly encourages the effective transfer of knowledge from the existing annotated abusive language resources that may have different distributions to low-resource corpora. This thesis studies the problem of transfer learning for abusive language detection and explores various solutions to improve knowledge transfer in cross-corpus scenarios. First, we analyze the cross-corpus generalizability of abusive language detection models without accessing the target during training. We investigate if combining topic model representations with contextual representations can improve generalizability. The association of unseen target comments with abusive language topics in the training corpus is shown to provide complementary information for a better cross-corpus transfer. Secondly, we explore Unsupervised Domain Adaptation (UDA), a type of transductive transfer learning, with access to the unlabeled target corpus. Some popular UDA approaches from sentiment classification are analyzed for cross-corpus abusive language detection. We further adapt a BERT model variant to the unlabeled target using the Masked Language Model (MLM) objective. While the latter improves the cross-corpus performance, the other UDA methods perform sub-optimally. Our analysis reveals their limitations and emphasizes the need for effective adaptation methods suited to this task. As our third contribution, we propose two DA approaches using feature attributions, which are post-hoc model explanations. Particularly, the problem ...
    • Relation:
      NNT: 2023LORR0019
    • Online access:
      https://hal.univ-lorraine.fr/tel-04106135
      https://hal.univ-lorraine.fr/tel-04106135v1/document
      https://hal.univ-lorraine.fr/tel-04106135v1/file/DDOC_T_2023_0019_BOSE.pdf
    • Rights:
      info:eu-repo/semantics/OpenAccess
    • Accession number:
      edsbas.9160BC2A