FRIDAY, April 26, 2024 (HealthDay News) -- Large language models (LLMs) are approaching expert-level knowledge and reasoning skills in ophthalmology, according to a study published online April 17 in PLOS Digital Health.
Arun James Thirunavukarasu, M.B., B.Chir., from the University of Oxford in the United Kingdom, and colleagues evaluated the clinical potential of state-of-the-art LLMs in ophthalmology. Responses to 87 questions were compared for GPT-3.5, GPT-4, PaLM 2, LLaMA, expert ophthalmologists, and doctors in training.
The researchers found that the performance of GPT-4 (69 percent) was superior to that of GPT-3.5 (48 percent), LLaMA (32 percent), and PaLM 2 (56 percent) and compared favorably with that of expert ophthalmologists (median, 76 percent), ophthalmology trainees (median, 59 percent), and unspecialized junior doctors (median, 43 percent). Low agreement between LLMs and doctors was due to idiosyncratic differences in knowledge and reasoning, with overall consistency across individuals and type. Grading ophthalmologists preferred GPT-4 responses over GPT-3.5 responses due to higher accuracy and relevance.
"LLMs are approaching expert-level ophthalmological knowledge and reasoning, and may be useful for providing eye-related advice where access to health care professionals is limited," the authors write. "Further research is required to explore potential avenues of clinical deployment."
One author disclosed a patent on a deep learning system to detect retinal disease.