Item request has been placed! ×
Item request cannot be made. ×
loading  Processing Request

Active Learning for Interactive Relation Extraction in a French Newspaper's Articles

Item request has been placed! ×
Item request cannot be made. ×
loading   Processing Request
  • معلومة اضافية
    • Contributors:
      Sipa Ouest-France; Creating and exploiting explicit links between multimedia fragments (LinkMedia); Inria Rennes – Bretagne Atlantique; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-MEDIA ET INTERACTIONS (IRISA-D6); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique); Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique); Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)
    • بيانات النشر:
      HAL CCSD
    • الموضوع:
      2021
    • Collection:
      Université de Rennes 1: Publications scientifiques (HAL)
    • الموضوع:
    • نبذة مختصرة :
      International audience ; Relation extraction is a subtask of natural language processing that has seen many improvements in recent years, with the advent of complex pre-trained architectures. Many of these state-of-the-art approaches are tested against benchmarks with labelled sentences containing tagged entities, and require important pretraining and fine-tuning on task-specific data. However, in a real use-case scenario such as in a newspaper company mostly dedicated to local information, relations are of varied, highly specific type, with virtually no annotated data for such relations, and many entities co-occur in a sentence without being related. We question the use of supervised state-of-the-art models in such a context, where resources such as time, computing power and human annotators are limited. To adapt to these constraints, we experiment with an active-learning based relation extraction pipeline, consisting of a binary LSTM-based lightweight model for detecting the relations that do exist, and a state-of-the-art model for relation classification. We compare several choices for classification models in this scenario, from basic word embedding averaging, to graph neural networks and Bert-based ones, as well as several active learning acquisition strategies, in order to find the most costefficient yet accurate approach in our French largest daily newspaper company's use case.
    • Relation:
      hal-03371917; https://hal.science/hal-03371917; https://hal.science/hal-03371917/document; https://hal.science/hal-03371917/file/ranlp2021.pdf
    • Rights:
      info:eu-repo/semantics/OpenAccess
    • الرقم المعرف:
      edsbas.4EB2728F