Item request has been placed! ×
Item request cannot be made. ×
loading  Processing Request

Filtering-Based Instance Selection Method for Overlapping Problem in Imbalanced Datasets

Item request has been placed! ×
Item request cannot be made. ×
loading   Processing Request
  • معلومة اضافية
    • بيانات النشر:
      MDPI AG, 2021.
    • الموضوع:
      2021
    • نبذة مختصرة :
      The overlapping problem occurs when a region of the dimensional data space is shared in a similar proportion by different classes. It has an impact on a classifier’s performance due to the difficulty in correctly separating the classes. Further, an imbalanced dataset consists of a situation in which one class has more instances than another, and this is another aspect that impacts a classifier’s performance. In general, these two problems are treated separately. On the other hand, Prototype Selection (PS) approaches are employed as strategies for selecting appropriate instances from a dataset by filtering redundant and noise data, which can cause misclassification performance. In this paper, we introduce Filtering-based Instance Selection (FIS), using as a base the Self-Organizing Maps Neural Network (SOM) and information entropy. In this sense, SOM is trained with a dataset, and, then, the instances of the training set are mapped to the nearest prototype (SOM neurons). An analysis with entropy is conducted in each prototype region. From a threshold, we propose three decision methods: filtering the majority class (H-FIS (High Filter IS)), the minority class (L-FIS (Low Filter IS)), and both classes (B-FIS). The experiments using artificial and real dataset showed that the methods proposed in combination with 1NN improved the accuracy, F-Score, and G-mean values when compared with the 1NN classifier without the filter methods. The FIS approach is also compatible with the approaches mentioned in the relevant literature.
    • File Description:
      application/pdf
    • ISSN:
      2571-8800
    • Rights:
      OPEN
    • الرقم المعرف:
      edsair.doi.dedup.....9f4a221eb06c9465d59b7375127b88b1