Cervical Cancer Diagnostics Using Machine Learning Algorithms and Class Balancing Techniques

  • Additional Information
    • Subject:
      2023
    • Collection:
      Repository of the University of Rijeka
    • Abstract:
      Objectives: Cervical cancer presents in most cases as squamous cell carcinoma, less often as adenocarcinoma, and is usually the result of an infection with human papillomavirus (HPV). This type of cancer is the third most common cancer of the female reproductive organs. The risk groups for cervical cancer are mostly younger women who frequently change partners, became sexually active early, are infected with HPV, and are addicted to nicotine. In most cases, the cancer is asymptomatic until it has progressed to the later stages. Cervical cancer screening rates are low, especially in developing countries and in some minority groups. For these reasons, introducing a preliminary questionnaire-based cervical cancer screening could enable more diagnoses in the initial stages of the disease. Methods: This research uses publicly available cervical cancer data collected on 859 female patients. Each sample consists of 36 input attributes and four different outputs: Hinselmann, Schiller, cytology, and biopsy. Because the data set is significantly imbalanced, class balancing techniques were applied: the Synthetic Minority Oversampling Technique (SMOTE), the ADAptive SYNthetic algorithm (ADASYN), SMOTEENN, random oversampling, and SMOTETomek. To predict the target outputs, multiple artificial intelligence (AI) and machine learning (ML) methods are proposed, including classification algorithms such as logistic regression, multilayer perceptron (MLP), support vector machine (SVM), K-nearest neighbors (KNN), and several naive Bayes methods. Results: The highest performance was achieved when MLP and KNN were used in combination with random oversampling, SMOTEENN, and SMOTETomek. This approach resulted in mean area under the receiver operating characteristic curve ($\overline{AUC}$) and mean Matthews correlation coefficient ($\overline{MCC}$) scores higher than 0.95, regardless of which diagnostic method was ...
    • File Description:
      application/pdf
    • Relation:
      Klinički bolnički centar Rijeka.; Clinical Hospital Center Rijeka.; https://www.unirepository.svkri.uniri.hr/islandora/object/medri:8004; https://urn.nsk.hr/urn:nbn:hr:184:975503; https://www.unirepository.svkri.uniri.hr/islandora/object/medri:8004/datastream/FILE0
    • Rights:
      info:eu-repo/semantics/openAccess ; http://creativecommons.org/licenses/by/4.0/
    • Accession Number:
      edsbas.173A3653
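
The abstract names random oversampling as one of the class balancing techniques and the Matthews correlation coefficient (MCC) as one of the evaluation scores. The following is a minimal, standard-library-only sketch of both ideas for the binary case. It is not the authors' implementation, which the record does not describe. The function names `random_oversample` and `matthews_corrcoef`, the `seed` parameter, and the toy data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class is as large as the largest one (plain random oversampling)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def matthews_corrcoef(y_true, y_pred):
    """Binary MCC from confusion-matrix counts (positive class = 1).
    Returns 0.0 when the denominator is zero, a common convention."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the cervical cancer data set:
    # eight negative samples and two positive ones.
    rows = [[i] for i in range(10)]
    labels = [0] * 8 + [1] * 2
    balanced_rows, balanced_labels = random_oversample(rows, labels)
    print(Counter(balanced_labels))
    print(matthews_corrcoef([1, 1, 0, 0], [1, 0, 0, 0]))
```

In practice, oversampling is usually applied only to the training portion of the data, so that duplicated samples do not leak into the evaluation set. SMOTE, ADASYN, SMOTEENN, and SMOTETomek generate or filter samples in more elaborate ways and are normally taken from an existing library rather than written by hand.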