Improving performance of natural language processing part-of-speech tagging on clinical narratives through domain adaptation.
Natural language processing (NLP) tasks are commonly decomposed into subtasks, chained together to form processing pipelines. The residual error produced in these subtasks propagates, adversely affecting the end objectives. Limited availability of annotated clinical data remains a barrier to reaching state-of-the-art operating characteristics using statistically based NLP tools in the clinical domain. Here we explore the unique linguistic constructions of clinical texts and demonstrate the loss in operating characteristics when [...]
Author(s): Ferraro, Jeffrey P; Daumé, Hal; Duvall, Scott L; Chapman, Wendy W; Harkema, Henk; Haug, Peter J
DOI: 10.1136/amiajnl-2012-001453