Generative artificial intelligence for automated data extraction from unstructured medical text.
Unstructured data, such as procedure notes, contain valuable medical information that is frequently underutilized due to the labor-intensive nature of data extraction. This study aims to develop a generative artificial intelligence (GenAI) pipeline using an open-source Large Language Model (LLM) with built-in guardrails and a retry mechanism to extract data from unstructured right heart catheterization (RHC) notes while minimizing errors, including hallucinations.
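The guardrail-plus-retry idea described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the field names (`mean_pa_pressure`, `pcwp`), the plausibility ranges, the prompt wording, and the `llm` callable are all hypothetical stand-ins chosen for illustration. The sketch assumes the model is asked for JSON, that a validator rejects malformed or implausible output, and that the call is repeated up to a fixed number of attempts.

```python
import json
from typing import Callable, Optional

# Hypothetical fields and plausibility ranges in mmHg (illustrative, not from the study).
PLAUSIBLE_RANGES = {
    "mean_pa_pressure": (5.0, 120.0),
    "pcwp": (0.0, 50.0),
}


def validate(raw: str) -> Optional[dict]:
    """Guardrail: accept only a JSON object with exactly the expected numeric fields, all in range."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or set(data) != set(PLAUSIBLE_RANGES):
        return None
    for key, (low, high) in PLAUSIBLE_RANGES.items():
        value = data[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
        if not low <= value <= high:
            return None
    return data


def extract_with_retry(
    note: str, llm: Callable[[str], str], max_attempts: int = 3
) -> Optional[dict]:
    """Call the (hypothetical) LLM until its output passes validation or attempts run out."""
    prompt = (
        f"Return only JSON with keys {sorted(PLAUSIBLE_RANGES)} "
        f"extracted from this note:\n{note}"
    )
    for _ in range(max_attempts):
        result = validate(llm(prompt))
        if result is not None:
            return result
    # Nothing passed the guardrail: return None rather than an unvalidated guess.
    return None
```

Returning `None` after the final failed attempt, instead of the last raw output, is one way to keep unvalidated values out of the extracted dataset; such notes could then be routed to manual review.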
Author(s): Dao, Nam; Quesada, Luisa; Hassan, Syed Moin; Campo, Monica Iturrioz; Johnson, Shelsey; Ghose, Suchandra; San José Estépar, Raúl; Waxman, Aaron; Washko, George; Rahaghi, Farbod N
DOI: 10.1093/jamiaopen/ooaf097