- Letter to the Editor
- Open access
Assessing the performance of ChatGPT-4 and ChatGPT-4o in lung cancer diagnoses
Journal of Translational Medicine volume 23, Article number: 346 (2025)
To the Editor,
Lung cancer is a highly invasive and prevalent disease and a leading cause of cancer death globally [1]. Timely diagnosis is key to improving outcomes. Pulmonary CT is essential for detecting lung cancer, relying on signs such as lobulation, spiculation, pleural indentation, and the vacuole sign, which help experts assess the type, location, and progression of the disease and aid clinical decision-making. Recently, with the rapid development of artificial intelligence (AI), large language models (LLM) such as ChatGPT-4o, ChatGPT-4, and Google Bard have introduced image-reading capabilities [2] and offered new solutions for early lung cancer diagnosis [3], particularly in low- and middle-income regions with limited clinical expertise, where they could substantially affect cost-effectiveness and local healthcare resource allocation. We therefore evaluated the accuracy and cost-effectiveness of ChatGPT versus clinical physicians in diagnosing lung cancer, using published cases.
This study, conducted in January 2025, reviewed 60 published lung cancer cases. We extracted the medical history, images (CT, H&E, IHC, PET/CT, etc.), and multiple-choice options from each case to create 60 question documents. These were entered into ChatGPT-4 and ChatGPT-4o, which were prompted to provide the most likely and second most likely answers, along with confidence ratings. Two chief lung oncologists independently conducted blinded evaluations of the same cases for comparison. For the cost analysis, we used the 2023 Eurozone average labor cost (35.6 EUR/hour, or 38.7 USD/hour).
After evaluating responses from ChatGPT-4o, ChatGPT-4, and the two physicians, we found that for the top diagnosis, the accuracy of ChatGPT-4o (73.33%) was comparable to that of ChatGPT-4 (60.00%) and physician-2 (88.67%) (P = 0.121 and P = 0.068, respectively), but significantly lower than that of physician-1 (95.00%) (P = 0.001). For the top two diagnoses, ChatGPT-4o (86.67%) likewise showed no significant difference from ChatGPT-4 (73.33%) or physician-2 (95.00%), but was significantly lower than physician-1 (98.33%) (P = 0.015). ChatGPT-4o reported higher confidence in its first diagnosis than ChatGPT-4 (P < 0.001), but lower confidence than both physicians (P < 0.001). Its confidence in the second diagnosis dropped but remained higher than that of ChatGPT-4 (P < 0.001), with no significant difference from the physicians. Both models required significantly less time and cost than the physicians (P < 0.001), with ChatGPT-4o being the fastest and most cost-effective (Fig. 1).
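The accuracy comparisons above can be reproduced from the reported percentages. For example, 73.33% and 60.00% of 60 cases correspond to 44 and 36 correct top diagnoses for ChatGPT-4o and ChatGPT-4; a minimal sketch of the Pearson chi-squared test on that 2×2 table (assuming, as the P-value suggests, no continuity correction) is:

```python
from math import erfc, sqrt

def pearson_chi2_2x2(a_correct, a_total, b_correct, b_total):
    """Pearson chi-squared test (df = 1, no continuity correction)
    comparing two proportions of correct diagnoses."""
    obs = [[a_correct, a_total - a_correct],
           [b_correct, b_total - b_correct]]
    n = a_total + b_total
    col = [a_correct + b_correct, n - (a_correct + b_correct)]
    row = [a_total, b_total]
    # Sum of (observed - expected)^2 / expected over all four cells
    chi2 = sum((obs[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
               for i in range(2) for j in range(2))
    # Survival function of chi-squared with 1 df: P(X > chi2) = erfc(sqrt(chi2/2))
    p = erfc(sqrt(chi2 / 2.0))
    return chi2, p

# ChatGPT-4o vs ChatGPT-4, top-diagnosis accuracy: 44/60 vs 36/60
chi2, p = pearson_chi2_2x2(44, 60, 36, 60)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")  # P matches the reported 0.121
```

The counts here are inferred from the published percentages, not taken from the raw data; with larger samples or small expected cell counts, a continuity correction or Fisher's exact test would be the more conservative choice.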
Our research shows that ChatGPT-4o achieves high accuracy in lung cancer diagnosis, nearing the performance of clinical physicians, with clear advantages in time and cost. Whereas ChatGPT-4 struggles with longer inputs and multimodal data, ChatGPT-4o's capabilities make it a strong tool for initial lung cancer diagnosis, especially in regions with limited medical expertise. ChatGPT-4o also maintains consistent performance, even in multitasking or emergency situations where physician accuracy might decrease. However, the study's small sample size and limited representation of lung tumor variability are notable limitations; larger studies covering more AI models are needed for further validation. Despite these limitations, our findings suggest ChatGPT-4o could serve as a low-cost, rapid diagnostic tool, helping physicians improve diagnostic accuracy and providing valuable guidance for non-medical professionals. This work lays a foundation for future AI-assisted lung cancer diagnosis and early intervention.
Fig. 1 Workflow and results comparing the accuracy, confidence, time, and cost of ChatGPT-4, ChatGPT-4o, and physicians in diagnosing lung cancer. ChatGPT-4o served as the control. Accuracy comparisons used Pearson's chi-squared test; confidence, time, and cost were compared with independent t-tests. ***: statistically significant; ns: not significant
Data availability
Publicly available data were analyzed in this study.
Abbreviations
- AI: Artificial intelligence
- LLM: Large language models
- ChatGPT: Chat generative pretrained transformer
- CT: Computed tomography
- H&E: Hematoxylin and eosin
- IHC: Immunohistochemistry
- PET/CT: Positron emission tomography/computed tomography
References
1. Leiter A, Veluswamy RR, Wisnivesky JP. The global burden of lung cancer: current status and future trends. Nat Rev Clin Oncol. 2023;20(9):624–39. https://doi.org/10.1038/s41571-023-00798-3.
2. Kanjee Z, Crowe B, Rodman A. Accuracy of a generative artificial intelligence model in a complex diagnostic challenge. JAMA. 2023;330(1):78–80. https://doi.org/10.1001/jama.2023.8288.
3. Huang S, Yang J, Shen N, et al. Artificial intelligence in lung cancer diagnosis and prognosis: current application and future perspective. Semin Cancer Biol. 2023;89:30–7. https://doi.org/10.1016/j.semcancer.2023.01.006.
Acknowledgements
We are grateful to the researchers who provided the original data.
Funding
This work was supported by the National Natural Science Foundation of China (No. 82473253).
Author information
Authors and Affiliations
Contributions
(I) Conception and design: CH Xie; (II) Administrative support: CH Xie and XF Dai; (III) Collection and assembly of data: JR Yang and X Cai; (IV) Data analysis and interpretation: JR Yang and X Cai; (V) Manuscript writing: JR Yang; (VI) Final approval of manuscript: All authors.
Corresponding authors
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Competing interests
All authors have completed the ICMJE uniform disclosure form. The authors have no conflicts of interest to declare.
Consent for publication
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yang, J., Cai, X., Dai, X. et al. Assessing the performance of ChatGPT-4 and ChatGPT-4o in lung cancer diagnoses. J Transl Med 23, 346 (2025). https://doi.org/10.1186/s12967-025-06337-1