Large Language Model Can Label Chest Radiography Reports

Vicuna-13B shows moderate-to-substantial agreement with existing labelers

TUESDAY, Oct. 17, 2023 (HealthDay News) -- Outputs of the large language model (LLM) Vicuna-13B show moderate-to-substantial agreement with existing labelers for 13 findings on chest radiography reports, according to a study published online Oct. 10 in Radiology.

Pritam Mukherjee, Ph.D., from the National Institutes of Health (NIH) Clinical Center in Bethesda, Maryland, and colleagues conducted a retrospective study to examine the feasibility of labeling radiography reports with Vicuna-13B, an alternative LLM that can be run locally. The study included chest radiography reports from the MIMIC-CXR and NIH data sets, and reports were assessed for 13 findings.

A total of 3,269 reports from the MIMIC-CXR data set and 25,596 from the NIH data set were included in the analysis. The researchers found that the Vicuna outputs with prompt 2 showed moderate-to-substantial agreement with the labelers on the MIMIC-CXR (median κ, 0.57 with CheXpert and 0.64 with CheXbert) and NIH (median κ, 0.52 with CheXpert and 0.55 with CheXbert) data sets. On nine of 11 findings, Vicuna with prompt 2 performed on par with both labelers (median area under the receiver operating characteristic curve, 0.84).
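For readers unfamiliar with the κ statistic cited above, the following is a minimal sketch of how Cohen's kappa measures agreement between two labelers beyond what chance alone would produce. It is not the study's code; the function name and the ten example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of reports on which both labelers agree.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each labeler's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both labelers used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical labels for one finding on ten reports (1 = present, 0 = absent).
llm_labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
reference_labels = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

kappa = cohen_kappa(llm_labels, reference_labels)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the two labelers agree on eight of ten reports, chance agreement is 0.5, and κ comes out at about 0.6. Under one commonly used convention, values from 0.41 to 0.60 are described as moderate agreement and 0.61 to 0.80 as substantial; whether the study applied exactly these cutoffs is not stated here.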

"This proof-of-concept study showed that it may be feasible to use publicly available large language models, such as Vicuna-13B, run locally in a privacy-preserving manner, without any additional training or fine-tuning, to label radiology reports and that the performance may be comparable to other commonly used tools, such as CheXpert and CheXbert, particularly when binary labels for each finding are desired," the authors write.

One author disclosed ties to industry.
