Open-radiomics: A Collection of Standardized Datasets and a Technical Protocol for Reproducible Radiomics Machine Learning Pipelines

Authors: Namdar, Khashayar; Wagner, Matthias W.; Ertl-Wagner, Birgit B.; Khalvati, Farzad
Subjects:
Quantitative Biology - Quantitative Methods; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
Record type:
text
Language:
unknown

Additional information
- Publication year:
  2022
- Collection:
  ArXiv.org (Cornell University Library)
- Abstract:
  Purpose: As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges, namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol, to investigate the effects of radiomics feature extraction on the reproducibility of the results. Materials and Methods: Experiments are conducted on the BraTS 2020 open-source Magnetic Resonance Imaging (MRI) dataset, which includes 369 adult patients with brain tumors (76 low-grade glioma (LGG) and 293 high-grade glioma (HGG)). Using the PyRadiomics library for LGG vs. HGG classification, 288 radiomics datasets are formed from the combinations of 4 MRI sequences, 3 binWidths, 6 image normalization methods, and 4 tumor subregions. Random Forest classifiers were used, and for each radiomics dataset the training-validation-test (60%/20%/20%) experiment was repeated 100 times with different data splits and model random states (28,800 test results), and the Area Under the Receiver Operating Characteristic Curve (AUC) was calculated. Results: Unlike binWidth and image normalization, tumor subregion and imaging sequence significantly affected the performance of the models. The T1 contrast-enhanced sequence and the union of the necrotic and non-enhancing tumor core subregions resulted in the highest AUCs (average test AUC 0.951, 95% confidence interval (0.949, 0.952)). Although 28 settings and data splits yielded a test AUC of 1, they were irreproducible. Conclusion: Our experiments demonstrate that sources of variability in radiomics pipelines (e.g., tumor subregion) can have a significant impact on the results, which may lead to superficially perfect performance that is irreproducible.
- Relation:
  http://arxiv.org/abs/2207.14776
- Accession number:
  edsbas.2D7A8062
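
The experimental design summarized in the abstract (4 x 3 x 6 x 4 = 288 radiomics datasets, each evaluated over 100 repeated 60%/20%/20% splits) can be sketched in a few lines of Python. This is only an illustrative outline, not the authors' code: the factor labels are placeholders (the abstract gives only the counts), the helper names `experiment_grid`, `split_indices`, and `roc_auc` are hypothetical, the rounding of split sizes is an assumption, and feature extraction and Random Forest training are deliberately left out.

```python
import itertools
import random

# Placeholder factor levels: only the counts (4, 3, 6, 4) come from the abstract.
SEQUENCES = [f"sequence_{i}" for i in range(1, 5)]
BIN_WIDTHS = [f"binwidth_{i}" for i in range(1, 4)]
NORMALIZATIONS = [f"normalization_{i}" for i in range(1, 7)]
SUBREGIONS = [f"subregion_{i}" for i in range(1, 5)]

N_PATIENTS = 369  # 76 LGG + 293 HGG, as stated in the abstract
N_REPEATS = 100   # repeated experiments per radiomics dataset


def experiment_grid():
    """Enumerate every combination of the four pipeline factors."""
    return list(itertools.product(SEQUENCES, BIN_WIDTHS, NORMALIZATIONS, SUBREGIONS))


def split_indices(n_samples, seed, train_frac=0.6, val_frac=0.2):
    """Seeded shuffle followed by a train/validation/test partition.

    Assumption: sizes are rounded, and the test set takes the remainder.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(train_frac * n_samples)
    n_val = round(val_frac * n_samples)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


grid = experiment_grid()
train, val, test = split_indices(N_PATIENTS, seed=0)

print(len(grid), len(grid) * N_REPEATS)  # 288 28800
print(len(train), len(val), len(test))   # 221 74 74
```

Giving every repeat its own seed keeps each of the 28,800 test results individually reproducible, which is what makes it possible to check whether an isolated test AUC of 1 survives a different split.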
