Automatic extraction of cancer registry reportable information from free-text pathology reports using multitask convolutional neural networks.
We implement two multitask learning (MTL) techniques, hard parameter sharing and cross-stitch, to train a word-level convolutional neural network (CNN) designed for automatic extraction of cancer data from unstructured text in pathology reports. We show that learning related information extraction (IE) tasks jointly, through representations shared across the tasks, is important for achieving state-of-the-art classification accuracy and computational efficiency.
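The two MTL techniques named above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy featurizer, the task names ("site", "histology"), and all weights are hypothetical stand-ins chosen for illustration. Under hard parameter sharing, every task head reads the same shared encoder output; in a cross-stitch unit, as commonly formulated, each task keeps its own activations and a small mixing matrix (learned in practice) blends them.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def shared_features(token_ids: Sequence[int], dim: int = 4) -> Vector:
    """Stand-in for a shared encoder: one feature vector per document.

    Hypothetical toy featurizer (hashed token counts, averaged). A real
    word-level CNN would use embeddings, convolutions, and pooling.
    """
    feats = [0.0] * dim
    for t in token_ids:
        feats[t % dim] += 1.0
    n = max(len(token_ids), 1)
    return [f / n for f in feats]


def linear_head(x: Vector, weights: List[Vector]) -> Vector:
    """Task-specific head: one score per class (one row of weights each)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def hard_sharing_forward(
    token_ids: Sequence[int], heads: Dict[str, List[Vector]]
) -> Dict[str, Vector]:
    """Hard parameter sharing: all tasks reuse the same encoder output."""
    h = shared_features(token_ids)
    return {task: linear_head(h, w) for task, w in heads.items()}


def cross_stitch(
    x_a: Vector, x_b: Vector, alpha: List[List[float]]
) -> Tuple[Vector, Vector]:
    """Cross-stitch unit: mix two tasks' activations with a 2x2 matrix.

    alpha[0] = [a_AA, a_AB] and alpha[1] = [a_BA, a_BB]. The values are
    fixed here for illustration; they would be trained in practice.
    """
    new_a = [alpha[0][0] * a + alpha[0][1] * b for a, b in zip(x_a, x_b)]
    new_b = [alpha[1][0] * a + alpha[1][1] * b for a, b in zip(x_a, x_b)]
    return new_a, new_b


# Hypothetical two-class heads for two illustrative tasks.
heads = {
    "site": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    "histology": [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
}
scores = hard_sharing_forward([0, 1, 1, 2], heads)
mixed_a, mixed_b = cross_stitch([1.0, 2.0], [3.0, 4.0], [[0.9, 0.1], [0.1, 0.9]])
```

With an identity mixing matrix the cross-stitch unit leaves both tasks' activations unchanged, so the tasks are effectively trained separately; off-diagonal weights control how much each task borrows from the other.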
Author(s): Alawad, Mohammed; Gao, Shang; Qiu, John X; Yoon, Hong Jun; Christian, J Blair; Penberthy, Lynne; Mumphrey, Brent; Wu, Xiao-Cheng; Coyle, Linda; Tourassi, Georgia
DOI: 10.1093/jamia/ocz153