Evaluating the effectiveness of biomedical fine-tuning for large language models on clinical tasks.
Large language models (LLMs) have shown potential in biomedical applications, leading to efforts to fine-tune them on domain-specific data. However, the effectiveness of this approach remains unclear. This study aims to critically evaluate the performance of biomedically fine-tuned LLMs against their general-purpose counterparts across a range of clinical tasks.
Author(s): Dorfner, Felix J; Dada, Amin; Busch, Felix; Makowski, Marcus R; Han, Tianyu; Truhn, Daniel; Kleesiek, Jens; Sushil, Madhumita; Adams, Lisa C; Bressem, Keno K
DOI: 10.1093/jamia/ocaf045