Reporting of demographic data and representativeness in machine learning models using electronic health records.
The development of machine learning (ML) algorithms to address a variety of issues faced in clinical practice has increased rapidly. However, questions have arisen regarding biases in their development that can affect their applicability in specific populations. We sought to evaluate whether studies developing ML models from electronic health record (EHR) data report sufficient demographic data on the study populations to demonstrate representativeness and reproducibility.
Author(s): Bozkurt, Selen; Cahan, Eli M; Seneviratne, Martin G; Sun, Ran; Lossio-Ventura, Juan A; Ioannidis, John P A; Hernandez-Boussard, Tina
DOI: 10.1093/jamia/ocaa164