GPT-4 meets or exceeds human eye specialists at diagnostic questions

February 23, 2024

The New York Eye and Ear Infirmary of Mount Sinai (NYEE) demonstrated how GPT-4 can meet or exceed human ophthalmologists in diagnosing and treating eye diseases. 

The findings, detailed in a study published in JAMA Ophthalmology, discuss how AI can assist eye specialists in their decision-making processes.

The research team at Mount Sinai involved 12 attending specialists and three senior trainees from the Department of Ophthalmology at the Icahn School of Medicine. 

They compared the responses of GPT-4 and the human specialists to a set of questions and patient cases related to glaucoma and retinal disorders. GPT-4’s responses matched or surpassed those of the human specialists, particularly in glaucoma.

Here’s a little more detail about how it worked:

  • Study setup: The research was conducted with a team of 15 ophthalmology professionals at the Mount Sinai Department of Ophthalmology, consisting of attending physicians and senior trainees specialized in glaucoma and retina diseases.
  • Data compilation: For the purpose of evaluation, the team used a well-rounded set of 20 ophthalmology questions (split evenly between glaucoma and retina topics) and 20 patient cases that were anonymized to maintain privacy. These were selected to reflect a range of cases taken from Mount Sinai clinics.
  • Use of AI: OpenAI’s GPT-4 was prompted to answer the questions and analyze the patient cases. The AI aimed to respond like a practicing ophthalmologist, using clinical shorthand where appropriate to mirror the concise style typical of clinical notes.
  • Evaluation: The study employed a rating system to assess both the accuracy and completeness of the responses from GPT-4 and the human specialists. This allowed for a direct comparison of the AI’s performance against that of trained professionals.
  • Glaucoma results: For glaucoma, GPT-4 provided highly accurate and more comprehensive responses than the human specialists. This indicates GPT-4’s strong capability in understanding and advising on glaucoma cases, potentially offering valuable support to ophthalmologists in this subspecialty.
  • Retina results: For retina-related inquiries and patient scenarios, GPT-4 matched human specialists, demonstrating its ability to diagnose and recommend treatments for retina conditions correctly. Further, GPT-4 often provided more elaborate responses, suggesting a thorough analysis and understanding of the cases, which could be especially beneficial in handling more complex or nuanced patient situations.
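As a rough illustration of the evaluation step, the sketch below averages accuracy and completeness ratings per topic and compares GPT-4 with the specialists. Everything in it is an assumption made for illustration: the numeric scale, the field names, the helper functions `summarize` and `compare`, and above all the ratings themselves, which are made-up placeholders and not data from the study. The article does not describe the actual rating scale or statistical method.

```python
from statistics import mean

# Hypothetical placeholder ratings, NOT data from the study.
# The numeric scale and field names are assumptions for illustration only.
RATINGS = [
    {"source": "gpt4", "topic": "glaucoma", "accuracy": 8, "completeness": 5},
    {"source": "gpt4", "topic": "glaucoma", "accuracy": 6, "completeness": 5},
    {"source": "specialist", "topic": "glaucoma", "accuracy": 6, "completeness": 4},
    {"source": "specialist", "topic": "glaucoma", "accuracy": 6, "completeness": 4},
    {"source": "gpt4", "topic": "retina", "accuracy": 7, "completeness": 5},
    {"source": "gpt4", "topic": "retina", "accuracy": 7, "completeness": 4},
    {"source": "specialist", "topic": "retina", "accuracy": 7, "completeness": 4},
    {"source": "specialist", "topic": "retina", "accuracy": 7, "completeness": 4},
]


def summarize(ratings):
    """Return the mean accuracy and completeness for each (topic, source) pair."""
    groups = {}
    for r in ratings:
        key = (r["topic"], r["source"])
        bucket = groups.setdefault(key, {"accuracy": [], "completeness": []})
        bucket["accuracy"].append(r["accuracy"])
        bucket["completeness"].append(r["completeness"])
    return {
        key: {metric: mean(values) for metric, values in bucket.items()}
        for key, bucket in groups.items()
    }


def compare(summary, topic, metric):
    """Return the mean rating difference for a topic: GPT-4 minus specialists."""
    return summary[(topic, "gpt4")][metric] - summary[(topic, "specialist")][metric]


if __name__ == "__main__":
    summary = summarize(RATINGS)
    for topic in ("glaucoma", "retina"):
        for metric in ("accuracy", "completeness"):
            print(topic, metric, compare(summary, topic, metric))
```

A positive difference means GPT-4 was rated higher on that metric and a value of zero means the two matched. A real analysis would also need a statistical test to judge whether any difference is meaningful.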

Dr. Andy Huang, an ophthalmology resident at NYEE and the study’s lead author, shared his insights, stating, “The performance of GPT-4 in our study was quite eye-opening. We recognized the enormous potential of this AI system from the moment we started testing it and were fascinated to observe that GPT-4 could not only assist but, in some cases, match or exceed, the expertise of seasoned ophthalmic specialists.”

Dr. Louis R. Pasquale, Deputy Chair for Ophthalmology Research and a senior author of the study, was also impressed by the results, stating, “AI was particularly surprising in its proficiency in handling both glaucoma and retina patient cases, matching the accuracy and completeness of diagnoses and treatment suggestions made by human doctors in a clinical note format.”

“Just as the AI application Grammarly can teach us how to be better writers, GPT-4 can give us valuable guidance on how to be better clinicians, especially in terms of how we document findings of patient exams.”

Dr. Huang envisions a future for AI in ophthalmology, noting its potential to support eye specialists by providing diagnostic assistance and reducing their workload, particularly in complex cases. 

AI has already proven well suited to eye health applications. Between 2018 and 2020, DeepMind used machine learning to accurately analyze high-resolution three-dimensional optical coherence tomography (OCT) scans and diagnose numerous conditions.

In 2023, researchers developed an AI model to accurately detect early signs of Parkinson’s disease from eye scans.

Another model was trained to detect Parkinson’s, heart disease, and other conditions from eye scans, and a third to detect certain diseases in infants from retinal scans.

Sam Jeans

Sam is a science and technology writer who has worked in various AI startups. When he’s not writing, he can be found reading medical journals or digging through boxes of vinyl records.

