
Addressing Label Noise in Colorectal Cancer Classification Using Cross-Entropy Loss and pLOF Methods With Stacking-Ensemble Technique

  • Additional information
    • Publication information:
      Wiley, 2025.
    • Subject:
      2025
    • Collection:
      LCC:Electronic computers. Computer science
    • Abstract:
      Colorectal cancer is a significant global health issue, ranking as the third most common cancer and the second leading cause of cancer-related deaths worldwide. Early diagnosis is of utmost importance for increasing the survival rate and strengthening the healthcare system. Many machine learning (ML) and deep learning (DL) methods have been proposed to facilitate automated early diagnosis of this cancer. However, label noise in medical images and dependence on a single model can lead to suboptimal performance, which could hinder the development of a sophisticated automated solution. In this paper, we address label noise in training data and propose a stacking-ensemble model for classifying colorectal cancer, along with a trustworthy computer-aided diagnosis (CAD) system. First, we extensively analyze a variety of filtering methods to determine the most suitable image representation and then apply data augmentation. Second, we fine-tune a modified VGG-16 model and use it as a feature extractor to obtain meaningful features from the training samples. Third, we apply prediction uncertainty and the probabilistic local outlier factor (pLOF) to the extracted features to address label noise in the training data. Fourth, we adopt a random forest–based recursive feature elimination (RF-RFE) feature selection method, with various combinations of features, to recursively select the most influential ones for accurate predictions. Fifth, we select four base ML classifiers and a metamodel to build our final stacking-ensemble model, which integrates the prediction probabilities of multiple models into a meta-feature set to ensure trustworthy predictions. Finally, we integrate these strategies and deploy them in a web application to demonstrate a CAD system. This system not only predicts the disease but also reports the prediction probability of each class, which enhances both clarity and diagnostic insight. Compared with different state-of-the-art ML classifiers on a publicly available dataset, our proposed model achieved the highest accuracy, 92.43%.
    • File Description:
      electronic resource
    • ISSN:
      1687-9732
    • Relation:
      https://doaj.org/toc/1687-9732
    • Identifier:
      10.1155/acis/6552580
    • Identifier:
      edsdoj.811ea39438d74094a468e19ace8108fd
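
The feature selection and stacking steps described in the abstract can be sketched roughly as follows with scikit-learn. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the four base classifiers or the metamodel, so the SVM, k-NN, random forest, decision tree, and logistic regression choices here are assumptions, as are the feature counts and the synthetic data standing in for VGG-16 features. The label-noise step (prediction uncertainty and pLOF) is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for deep features extracted by a CNN (assumption).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# RF-RFE: repeatedly fit a random forest and drop the least important
# features (5 per round) until 15 remain. Both numbers are illustrative.
selector = RFE(
    RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=15,
    step=5,
)

# Four base classifiers (hypothetical choices, not taken from the paper).
base_learners = [
    ("svm", SVC(probability=True, random_state=0)),
    ("knn", KNeighborsClassifier()),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("dt", DecisionTreeClassifier(random_state=0)),
]

# The metamodel is trained on the base models' cross-validated class
# probabilities, which form the meta-feature set.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)

model = make_pipeline(selector, stack)
model.fit(X_train, y_train)

# Per-class probabilities, as a CAD front end might display them.
proba = model.predict_proba(X_test)
```

Each row of `proba` holds one probability per class for a test sample, which corresponds to the per-class probability output that the abstract attributes to the CAD system.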