Reliable generation of privacy-preserving synthetic electronic health record time series via diffusion models.
Electronic health records (EHRs) are rich sources of patient-level data and a valuable resource for medical data analysis. However, privacy concerns often restrict access to EHRs, hindering downstream analysis. Current EHR deidentification methods are flawed and can lead to privacy leakage. Additionally, existing publicly available EHR databases are limited, which impedes medical research using EHRs. This study aims to overcome these challenges by generating realistic and privacy-preserving synthetic [...]
Author(s): Tian, Muhang; Chen, Bernie; Guo, Allan; Jiang, Shiyi; Zhang, Anru R
DOI: 10.1093/jamia/ocae229