SAT: a Surrogate-Assisted Two-wave case boosting sampling method, with application to EHR-based association studies.
Electronic health records (EHRs) enable investigation of the association between phenotypes and risk factors. However, studies relying solely on potentially error-prone EHR-derived phenotypes (i.e., surrogates) are subject to bias. Analyses of low-prevalence phenotypes may also suffer from poor efficiency. Existing methods typically focus on one of these issues but seldom address both. This study aims to address both issues simultaneously by developing new sampling methods to select an optimal [...]
Author(s): Liu, Xiaokang; Chubak, Jessica; Hubbard, Rebecca A; Chen, Yong
DOI: 10.1093/jamia/ocab267