A generalizable data assembly algorithm for infectious disease outbreaks.
During infectious disease outbreaks, health agencies often share text-based information about cases and deaths. This information is rarely machine-readable, thus creating challenges for outbreak researchers. Here, we introduce a generalizable data assembly algorithm that automatically curates text-based, outbreak-related information and demonstrate its performance across 3 outbreaks. After developing an algorithm with regular expressions, we automatically curated data from health agencies via 3 information sources: formal reports, email newsletters, and Twitter [...]
Author(s): Majumder, Maimuna S; Rose, Sherri
DOI: 10.1093/jamiaopen/ooab058
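
The abstract describes curating case and death counts from free-text agency updates using regular expressions. The following is a minimal sketch of that general idea, not the authors' algorithm: the patterns, the function name `extract_counts`, and the sample sentence are all hypothetical, since the abstract does not give the paper's actual expressions or data.

```python
import re

# Hypothetical patterns (assumptions for illustration only): a number, optionally
# with thousands separators, followed by an optional qualifier and "case(s)" or
# "death(s)".
CASES_RE = re.compile(
    r"(\d[\d,]*)\s+(?:new\s+|confirmed\s+|suspected\s+)*cases?\b",
    re.IGNORECASE,
)
DEATHS_RE = re.compile(r"(\d[\d,]*)\s+(?:new\s+)?deaths?\b", re.IGNORECASE)


def _to_int(number_text):
    """Convert a matched number such as '1,204' to the integer 1204."""
    return int(number_text.replace(",", ""))


def extract_counts(text):
    """Return the first case and death counts found in text, or None for each if absent."""
    cases = CASES_RE.search(text)
    deaths = DEATHS_RE.search(text)
    return {
        "cases": _to_int(cases.group(1)) if cases else None,
        "deaths": _to_int(deaths.group(1)) if deaths else None,
    }


# Made-up example sentence, not taken from any real report.
sample = "The ministry reported 1,204 confirmed cases and 37 deaths as of 12 May."
print(extract_counts(sample))
```

A sketch like this only captures the first match of each kind in a single string; real reports, newsletters, and tweets would likely need source-specific patterns and handling for cumulative versus incremental counts.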