On Studying the Effect of Data Quality on Classification Performances

  • Additional Information
    • Contributors:
      Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes (LIMOS); Ecole Nationale Supérieure des Mines de St Etienne (ENSM ST-ETIENNE)-Centre National de la Recherche Scientifique (CNRS)-Université Clermont Auvergne (UCA)-Institut national polytechnique Clermont Auvergne (INP Clermont Auvergne); Université Clermont Auvergne (UCA)-Université Clermont Auvergne (UCA); Institut Universitaire de Technologie - Clermont-Ferrand (IUT Clermont-Ferrand); Université Clermont Auvergne (UCA)
    • Publication Information:
      HAL CCSD
      Springer International Publishing
    • Publication Year:
      2022
    • Collection:
      HAL Clermont Auvergne (Université Blaise Pascal Clermont-Ferrand / Université d'Auvergne)
    • Subject:
    • Abstract:
      International audience; The field of data repairing is very active and produces many repairing methods. This abundance of options can make choosing a repairing method hard. Our research question stems from this problem: Is it always better to repair data? Our work is placed in the context of classification tasks and numerical data. We investigate our research question through five criteria: C1, the perceived difficulty of using a repairing method according to experts; C2, the impact of the level of data degradation on accuracies and F1 scores; C3, the effectiveness of the repairing tool; C4, the impact of the type of error present in the data; and C5, the impact of the classification model used. Our main contributions are a method to evaluate the difficulty of using a repairing tool and a study of experimental results on accuracies and F1 scores investigating C2 to C5. We were able to identify two categories of error types: low impact and high impact on accuracy and F1 score. We also observed that repairing methods perform similarly both at very low and very high levels of errors.
    • ISBN:
      978-3-031-21752-4
      3-031-21752-7
    • Relation:
      hal-03938077; https://uca.hal.science/hal-03938077; https://uca.hal.science/hal-03938077/document; https://uca.hal.science/hal-03938077/file/IDEAL2022-paper2690.pdf
    • DOI:
      10.1007/978-3-031-21753-1_9
    • Online Access:
      https://uca.hal.science/hal-03938077
      https://uca.hal.science/hal-03938077/document
      https://uca.hal.science/hal-03938077/file/IDEAL2022-paper2690.pdf
      https://doi.org/10.1007/978-3-031-21753-1_9
    • Rights:
      info:eu-repo/semantics/OpenAccess
    • Accession Number:
      edsbas.8D04F179