
Temporal evolution of large language models (LLMs) in oncology.

  • Additional Information
    • Source:
      Publisher: BioMed Central Country of Publication: England NLM ID: 101190741 Publication Model: Electronic Cited Medium: Internet ISSN: 1479-5876 (Electronic) Linking ISSN: 14795876 NLM ISO Abbreviation: J Transl Med Subsets: MEDLINE
    • Publication Information:
      Original Publication: [London] : BioMed Central, 2003-
    • Subject:
    • Abstract:
      Competing Interests: Declaration. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare that they have no competing interests.
      Background: Large language models (LLMs) are increasingly being applied in healthcare; however, their performance in specialized fields, such as oncology, is subject to temporal factors, including knowledge decay and concept drift. The impact of these temporal dynamics on LLM question-answering accuracy in oncology remains inadequately evaluated. This study aims to systematically assess the temporal evolution of LLM accuracy in responding to oncology-related questions using real-world data.
      Method: We systematically collected relevant literature through 2025 by searching LLM-related keywords in PubMed, Google Scholar, and Web of Science databases. The inclusion criteria were as follows: (1) cancer-related research; (2) clear and complete question descriptions; and (3) complete answers. The final sample (n = 23) contained 614 research questions, comprising subjective questions (n = 223) and multiple-choice questions (n = 391). Following randomization of responses generated by three LLMs (ChatGPT-3.5, ChatGPT-4, and Gemini), we evaluated their accuracy across different cancer categories using both original scoring criteria and Likert scale scoring methods. Data analysis was performed using R statistical software, employing random or fixed effects models to calculate pooled mean differences (MD) and relative risks (RR) with their 95% confidence intervals (CI).
      Results: The findings demonstrated that in both subjective and objective oncology assessments, ChatGPT-3.5 (subjective questions MD = -3.30; objective questions RR = 0.92) and ChatGPT-4 (subjective questions MD = -7.17; objective questions RR = 0.93) showed declining performance trends over time, while Gemini exhibited significant improvements over time (subjective questions MD = 11.48; objective questions RR = 1.15). Notably, ChatGPT-3.5's performance on subjective questions revealed a significant turning point between March 14, 2023, and April 26, 2023, shifting from initially superior performance on newer questions to inferior performance compared with original questions, with the performance gap progressively widening.
      Conclusions: Our meta-analysis reveals temporal performance degradation in ChatGPT-3.5 and ChatGPT-4, which contrasts with the consistent improvement observed in Gemini. These findings provide essential guidance for the evidence-based deployment of LLMs in oncology.
      (© 2025. The Author(s).)
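The pooling step named in the Method (fixed or random effects models giving a pooled estimate with a 95% CI) can be illustrated with a minimal sketch. This is not the authors' code, which was written in R. It assumes inverse-variance weighting and the DerSimonian-Laird estimator for the between-study variance, neither of which the abstract specifies. The function names `log_rr` and `pool` and all input numbers are hypothetical.

```python
import math

def log_rr(events_new, total_new, events_orig, total_orig):
    """Log relative risk and its approximate variance for one study.

    Hypothetical helper: 'events' are correct answers, 'total' is the
    number of questions asked in each arm.
    """
    rr = (events_new / total_new) / (events_orig / total_orig)
    var = 1 / events_new - 1 / total_new + 1 / events_orig - 1 / total_orig
    return math.log(rr), var

def pool(effects, variances, random_effects=False):
    """Inverse-variance pooled estimate with a 95% confidence interval.

    With random_effects=True, the between-study variance tau^2 is
    estimated with the DerSimonian-Laird method (an assumption here)
    and added to every study's variance before re-weighting.
    Returns (estimate, ci_low, ci_high, tau2).
    """
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    tau2 = 0.0
    if random_effects and len(effects) > 1:
        # Cochran's Q measures how far studies scatter around the fixed estimate.
        q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
        df = len(effects) - 1
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
        weights = [1 / (v + tau2) for v in variances]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se, tau2

if __name__ == "__main__":
    # Made-up counts for three studies: (correct_new, n_new, correct_orig, n_orig).
    studies = [(40, 50, 44, 50), (30, 40, 33, 40), (55, 70, 60, 70)]
    pairs = [log_rr(*s) for s in studies]
    est, low, high, tau2 = pool([p[0] for p in pairs],
                                [p[1] for p in pairs],
                                random_effects=True)
    # Back-transform from the log scale to report a pooled RR.
    print(f"pooled RR = {math.exp(est):.2f} "
          f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f}), tau^2 = {tau2:.3f}")
```

Pooling is done on the log scale because the log relative risk is closer to normally distributed. Mean differences can be passed to `pool` directly, without the log transform.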
    • Contributed Indexing:
      Keywords: Large language models; Meta-analysis; Oncology; Performance evaluation; Temporal analysis
    • Entry Date(s):
      Date Created: 20251104 Date Completed: 20251105 Latest Revision: 20251108
    • Update Code:
      20251108
    • PubMed Central ID:
      PMC12584546
    • DOI:
      10.1186/s12967-025-07227-2
    • PMID:
      41188901