CancerBERT: a cancer domain-specific language model for extracting breast cancer phenotypes from electronic health records.
Accurate extraction of breast cancer patients' phenotypes is important for clinical decision support and clinical research. This study developed and evaluated CancerBERT, a family of cancer domain-specific pretrained language models, for extracting breast cancer phenotypes from clinical texts. It also investigated the effect of a customized cancer-related vocabulary on the performance of the CancerBERT models.
Author(s): Zhou, Sicheng; Wang, Nan; Wang, Liwei; Liu, Hongfang; Zhang, Rui
DOI: 10.1093/jamia/ocac040