Validating a membership disclosure metric for synthetic health data.
One increasingly accepted way to evaluate the privacy of synthetic data is to measure the risk of membership disclosure. This risk is expressed as the F1 accuracy with which an adversary would correctly ascertain that a target individual from the same population as the real data is in the dataset used to train the generative model, and it is commonly estimated using a data partitioning methodology with a 0.5 partitioning [...]
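As a rough illustration of the quantities named in the abstract, the sketch below partitions a set of records in half and scores a membership guess with F1. It is not the authors' procedure: the record IDs, the helper names `partition` and `f1_score`, and the "guess member for everyone" adversary are all hypothetical, chosen only to show how the score behaves under a 0.5 partition.

```python
import random

def f1_score(is_member, guessed_member):
    """F1 of membership guesses against ground truth (two lists of bool)."""
    pairs = list(zip(is_member, guessed_member))
    tp = sum(1 for m, g in pairs if m and g)        # correctly claimed members
    fp = sum(1 for m, g in pairs if not m and g)    # non-members claimed as members
    fn = sum(1 for m, g in pairs if m and not g)    # members that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def partition(records, train_fraction=0.5, seed=0):
    """Randomly split records into a training part and a holdout part."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical record IDs standing in for real individuals.
records = list(range(1000))
train, holdout = partition(records, train_fraction=0.5)
train_ids = set(train)

# Ground truth: was each target in the (assumed) training partition?
is_member = [r in train_ids for r in records]

# Naive adversary (an assumption for illustration): claim everyone is a member.
guess_all = [True] * len(records)
baseline_f1 = f1_score(is_member, guess_all)
```

With an even split, this naive adversary has precision 0.5 and recall 1, so `baseline_f1` is 2/3. An attack is only informative to the extent that its F1 exceeds what such a guess achieves.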
Author(s): El Emam, Khaled; Mosquera, Lucy; Fang, Xi
DOI: 10.1093/jamiaopen/ooac083