Chat GPT vs an experienced ophthalmologist: evaluating chatbot writing performance in ophthalmology | Eye

Apr 01, 2025

Eye (2025)

To examine the ability of ChatGPT to write introductions for scientific ophthalmology papers and to compare it with that of experienced ophthalmologists.

The OpenAI web interface was used to prompt ChatGPT 4 to generate introductions for the selected papers. Each paper therefore had two introductions: one drafted by ChatGPT and the other by the original author. Ten ophthalmology specialists, each with more than 15 years of experience and representing distinct subspecialties (retina, neuro-ophthalmology, oculoplastic, glaucoma, and ocular oncology), were given the two sets of introductions without being told their origin (ChatGPT or human author) and were asked to evaluate them.

For each type of introduction, out of 45 instances, specialists correctly identified the source 26 times (57.7%) and erred 19 times (42.2%). The misclassification rate was 25% for experts evaluating introductions from their own subspecialty and 44.4% for experts assessing introductions outside their subspecialty. In the comparative evaluation of introductions written by ChatGPT and by human authors, no significant difference was identified across the assessed metrics (language, data arrangement, factual accuracy, originality, and data currency). The misclassification rate (the frequency at which reviewers incorrectly identified the authorship) was highest in oculoplastic (66.7%) and lowest in retina (11.1%).
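As a quick check of the arithmetic behind the headline figures, the following minimal Python sketch recomputes the identification and misclassification rates from the reported counts (26 correct out of 45). The function name `identification_rates` is hypothetical and is not part of the study's analysis; only the two counts come from the text above.

```python
def identification_rates(correct: int, total: int) -> tuple[float, float]:
    """Return (percent correctly identified, percent misclassified).

    Hypothetical helper for illustration; not the authors' analysis code.
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("counts must satisfy 0 <= correct <= total and total > 0")
    accuracy = 100 * correct / total
    return accuracy, 100 - accuracy


# Counts reported in the results: 26 correct identifications out of 45.
accuracy, misclassified = identification_rates(26, 45)
print(f"{accuracy:.1f}% correct, {misclassified:.1f}% misclassified")
# prints "57.8% correct, 42.2% misclassified"
```

Rounded to one decimal place, 26/45 gives 57.8%, so the reported 57.7% is presumably a truncated rather than a rounded value; the two percentages sum to 100% either way.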

ChatGPT represents a significant advancement in facilitating the creation of original scientific papers in ophthalmology. The introductions generated by ChatGPT showed no statistically significant difference from those written by experts in terms of language, data organization, factual accuracy, originality, and the currency of information. In addition, nearly half of them were indistinguishable from the originals. Future research should explore ChatGPT-4’s utility in composing other sections of research papers and examine the associated ethical considerations.

The data that support the findings of this study are available from Sheba Medical Center but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Sheba Medical Center.

These authors contributed equally: Gabriel Katz, Ofira Zloto.

Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel

Gabriel Katz, Ofira Zloto, Avner Hostovsky, Ruth Huna-Baron, Iris Ben-Bassat Mizrachi, Zvia Burgansky, Alon Skaat, Vicktoria Vishnevskia-Dai, Ido Didi Fabian, Oded Sagiv & Ayelet Priel

Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel

Gabriel Katz, Ofira Zloto, Avner Hostovsky, Ruth Huna-Baron, Iris Ben-Bassat Mizrachi, Zvia Burgansky, Alon Skaat, Vicktoria Vishnevskia-Dai, Ido Didi Fabian, Oded Sagiv & Ayelet Priel

Section of Ophthalmology, Department of Head and Neck Surgery, The University of Texas MD Anderson Cancer Center, Houston, TX, USA

Oded Sagiv

The Windreich Department of Artificial Intelligence and Human Health, Mount Sinai Medical Center, New York, NY, USA

Benjamin S. Glicksberg & Eyal Klang

The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA

Eyal Klang

Conceived and designed the analysis: OZ, EK, GK. Collected the data: OZ, GK. Contributed data: OZ, GK, AH, RHB, IBBM, ZB, AS, VVD, IDF, OS, AP, BSG. Performed the analysis: EK. Wrote the paper: OZ, EK. Revised the paper: GK, AH, RHB, IBBM, ZB, AS, VVD, IDF, OS, AP, BSG.

Correspondence to Ofira Zloto.

The authors declare no competing interests.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Katz, G., Zloto, O., Hostovsky, A. et al. Chat GPT vs an experienced ophthalmologist: evaluating chatbot writing performance in ophthalmology. Eye (2025). https://doi.org/10.1038/s41433-025-03779-1

Received: 02 February 2024

Revised: 18 February 2025

Accepted: 20 March 2025

Published: 01 April 2025

DOI: https://doi.org/10.1038/s41433-025-03779-1
