Image 1_Evaluation of large language models in percutaneous coronary intervention decision-making.pdf

Authors: Chengze Lin; Yongying Lan; Zi Zeng; Lingfang Zhuang; Rui Dai; Gang Lv; Yun Xie; Qi Jin; Liqun Wu; Qing Cao; Kang Chen
Subject:
Cardiology; clinical decision support; dynamic guidelines; large language models; model ensemble; percutaneous coronary intervention
Record type:
still image
Language:
unknown

Additional information
- Publication year:
  2026
- Collection:
  Frontiers: Figshare
- Abstract:
  Background: Clinical decision-making for percutaneous coronary intervention (PCI) in patients with moderate-to-severe coronary stenosis is complex and sensitive to data completeness and guideline interpretation. We aimed to evaluate large language models (LLMs) for PCI support and to develop an ensemble framework for this complex decision setting.
  Methods: In this retrospective study, 15 LLM versions were evaluated using data of 93 patients from Ruijin Hospital. A hierarchical framework was employed to assess performance across varying data inputs. To optimize accuracy, advanced grouped ensemble strategies were developed and validated via nested repeated stratified 5-fold cross-validation. Probabilistic reliability and clinical utility were quantified through calibration plots and Decision Curve Analysis (DCA). Statistical robustness was ensured by bootstrap ROC-AUC comparisons with Holm-Bonferroni adjustment and restricted cubic spline modeling to analyze age-performance interactions.
  Results: Distinct behavioral patterns emerged across LLM families: Llama-3.3-70B-Instruct made more aggressive recommendations, whereas Grok-3 was more conservative. Holm-adjusted analysis identified significant performance gaps at age cut-points of 73, 75, and 76. A significant age-score interaction (LRT p = 0.00089) confirmed that patient age modulates model performance. The advanced ensemble strategies surpassed individual models, with an adaptive grouped ensemble achieving an F1 score of 0.921, compared to 0.807 for the best single model and 0.794 for a standard ensemble.
  Conclusion: Tailored LLM ensembles are feasible for PCI decision support and can improve robustness. Further multicenter prospective validation and multimodal integration are needed before clinical deployment.
- Identifier:
  10.3389/fcvm.2026.1690716.s003
- Online access:
  https://doi.org/10.3389/fcvm.2026.1690716.s003
  https://figshare.com/articles/figure/Image_1_Evaluation_of_large_language_models_in_percutaneous_coronary_intervention_decision-making_pdf/31922439
- Rights:
  CC BY 4.0
- Identifier:
  edsbas.B0D698C5
