Differential Evaluation: a Qualitative Analysis of Natural Language Processing System Behavior Based Upon Data Resistance to Processing

Authors: Gianola, Lucie; El Boukkouri, Hicham; Grouin, Cyril; Lavergne, Thomas; Paroubek, Patrick; Zweigenbaum, Pierre
Source:
Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems ; EVAL4NLP, 2nd Workshop on "Evaluation & Comparison of NLP Systems", EMNLP 2021, Nov 2021, Punta Cana, Dominican Republic ; https://hal.science/hal-03432331
Subject:
[SHS.LANGUE]Humanities and Social Sciences/Linguistics; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; [INFO]Computer Science [cs]
Record Type:
conference object
Language:
English

Additional Information
- Contributors:
  Information, Langue Ecrite et Signée (ILES); Laboratoire Interdisciplinaire des Sciences du Numérique (LISN); Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Sciences et Technologies des Langues - LISN (STL); Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)
- Publisher Information:
  HAL CCSD
- Publication Year:
  2021
- Subject Geographic:
  Punta Cana; Dominican Republic
- Abstract:
  International audience ; Most of the time, when dealing with a particular Natural Language Processing task, systems are compared on the basis of global statistics such as recall, precision, F1-score, etc. While such scores provide a general idea of the behavior of these systems, they ignore a key piece of information that can be useful for assessing progress and discerning remaining challenges: the relative difficulty of test instances. To address this shortcoming, we introduce the notion of differential evaluation, which effectively defines a pragmatic partition of instances into gradually more difficult bins by leveraging the predictions made by a set of systems. Comparing systems along these difficulty bins enables us to produce a finer-grained analysis of their relative merits, which we illustrate on two use cases: a comparison of systems participating in a multi-label text classification task (CLEF eHealth 2018 ICD-10 coding), and a comparison of neural models trained for biomedical entity detection (BioCreative V chemical-disease relations dataset).
- Relation:
  hal-03432331; https://hal.science/hal-03432331; https://hal.science/hal-03432331/document; https://hal.science/hal-03432331/file/Differential_Evaluation__Assessing_Natural_Language_Processing_SystemPerformance_Based_Upon_Data_Resistance_to_Processing_final.pdf
- Rights:
  info:eu-repo/semantics/OpenAccess
- Accession Number:
  edsbas.10C52CEF
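
Illustrative note: the abstract describes partitioning test instances into difficulty bins from the predictions of a set of systems. The sketch below shows one possible reading of that idea, in which an instance's difficulty is simply the number of systems that mislabel it. This binning rule, the function names `difficulty_bins` and `accuracy_per_bin`, and the toy data are assumptions made for illustration; the record does not specify the authors' exact procedure.

```python
from collections import defaultdict


def difficulty_bins(gold, predictions):
    """Group instance ids by how many systems got them wrong.

    gold: dict mapping instance id -> reference label
    predictions: dict mapping system name -> dict (instance id -> predicted label)
    Returns a dict mapping failure count -> sorted list of instance ids.
    Assumption: difficulty = number of failing systems (illustrative only).
    """
    bins = defaultdict(list)
    for inst, label in gold.items():
        # A missing prediction counts as a failure.
        n_failed = sum(
            1 for preds in predictions.values() if preds.get(inst) != label
        )
        bins[n_failed].append(inst)
    return {k: sorted(v) for k, v in sorted(bins.items())}


def accuracy_per_bin(gold, predictions, bins):
    """Compute each system's accuracy separately inside every difficulty bin."""
    scores = {}
    for system, preds in predictions.items():
        scores[system] = {
            k: sum(preds.get(i) == gold[i] for i in ids) / len(ids)
            for k, ids in bins.items()
        }
    return scores


# Toy data (hypothetical): four instances, three systems.
gold = {"a": 1, "b": 0, "c": 1, "d": 0}
predictions = {
    "s1": {"a": 1, "b": 0, "c": 0, "d": 1},
    "s2": {"a": 1, "b": 0, "c": 1, "d": 1},
    "s3": {"a": 1, "b": 1, "c": 0, "d": 1},
}

bins = difficulty_bins(gold, predictions)
scores = accuracy_per_bin(gold, predictions, bins)
print(bins)
print(scores["s2"])
```

In this toy setting, a global accuracy figure would hide the fact that only `s2` handles the instance two other systems miss, while no system handles the hardest one; the per-bin view makes that visible.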