Large language models (LLMs) have shown remarkable potential in the medical domain, and specialized medical LLMs have been developed and applied across a variety of tasks. However, their effectiveness in medicine remains uncertain, and practical guidance for deploying them in specific applications is still limited. In this talk, I will present a benchmark study evaluating four representative LLMs on 12 BioNLP datasets. I will also discuss domain-specific applications of medical LLMs, focusing on (1) ophthalmology-related clinical tasks and (2) automated response generation for outpatient queries.
Watch the Recording
Presenter